The Challenge
of Mount Improbable


A Special Issue of CRSQ


Kevin Anderson and Jean Lightner 


More than 150 years ago, Charles
Darwin proposed his ideas on
the origin of species. He concluded that 
some traits could benefit organisms (e.g., 
make them faster) and some traits could 
hinder organisms (e.g., make them slow- 
er). Thus, those possessing beneficial 
traits had a greater chance of survival, 
and those with detrimental traits had a 
reduced chance of survival. The essence 
of this conclusion has become known as 
natural selection. 

Darwin saw that variation in traits 
exists, but since his studies predated 
the field of genetics, he had no real 
understanding of the underlying basis 
for this variation. Differences in traits 
were taken as a given, and differences 
between species were generally attrib- 
uted to the effects of natural selection. 
Even today some biologists claim that a 
particular trait arose by natural selection, 
as if traits magically appear when needed 
so natural selection can act upon them. 

Not bound by any laws of heredity or 
an understanding of molecular genetics, 
Darwin saw “no reason to limit” the types 
of changes that an organism could un- 
dergo (Darwin, 1999, p. 127). With no 
limits, Darwin made some rather bold 
assumptions, not the least of which is 
that these changes could be so dramatic
as to transform fish into amphibians or
reptiles into mammals. He referred to 
this unlimited change as “descent with 
modification” (Darwin, 1999, p. 126). 
With this presumption, Darwin had 
outlined the basic concepts of universal 
common descent—the idea that all 
“animals and plants have descended 
from some one prototype,” and all life 
shares a common ancestry (Darwin, 
1999, p. 472). 

About the same time that Darwin
published On the Origin of Species, an Austrian
monk was conducting experiments
with pea plants. From these experiments, 
Gregor Mendel observed that peas con- 
tain something he called factors, which 
caused the plant to grow tall or short and 
the pea pod to be yellow or green. These 
factors were also passed on to subsequent 
generations, affecting their growth and 
color as well. 

Mendel presented his findings at a
meeting of Austria’s Brünn Society, but
probably few listeners comprehended
much of what they heard that day. Following
his presentation, Mendel’s work
was published in the 1866 Proceedings
of the Brünn Society. In an attempt to
publicize his work, Mendel sent copies 
of his article to several well-known botanists
and other naturalists. Interestingly,
a copy of Mendel’s article was reportedly
found in Darwin’s library. However, 
the copy was “uncut,” indicating that 
Darwin had not even taken the time to 
slit the pages apart and read it (Henig, 
2000). In fairness to Darwin, it would 
seem reasonable that, like most other 
naturalists of that time, even had he read 
Mendel’s work, he would have had little 
comprehension of its significance. In 
fact, nearly fifty years would pass before 
the importance of Mendel’s work would 
become recognized. During those same 
fifty years, though, Darwin’s work would 
gain wider and wider acceptance (not 
coincidentally, without any genetic basis
for the changes Darwin proposed).

What Mendel had unknowingly
discovered was the inheritance patterns 
of chromosomal DNA. Mendel’s factors 
were actually different versions of genes 
(known as alleles) carried on chromo- 
somes. We now understand that changes 
to the chromosomal DNA nucleotide 
sequence (i.e., mutations) can alter 
the organism’s physical features. Some 
mutations may cause the organism to 
grow a little taller or camouflage a little 
better. Other mutations may weaken the 
organism, such as reducing its line of 
vision or physical strength. Some muta- 
tions are lethal, resulting in death, while 
still others may be “neutral,” having no
noticeable effect. 

As Mendel’s ideas became more 
broadly understood, evolutionists 
worked to incorporate this information 
into Darwinian teaching. In 1942, Julian 
Huxley introduced the term “modern 
synthesis” to reflect an updating of 
Darwinism to include basic Mendelian 
genetics (Huxley, 1942). Also known 
as neo-Darwinism, this updating was 
popularized by such noted evolutionists 
as George Gaylord Simpson and Ernst 
Mayr. However, even this updating pre- 
dated an understanding of gene activity 
and molecular biology (a very significant 
absence). 

Neo-Darwinism still represents 
the most popular form of evolution 
(almost exclusively what is presented 
in textbooks and journal publications). 
The most common version relies on ran- 
dom mutations to achieve the physical 
changes necessary for dramatic trans- 
formations; for example, amphibians 
evolving legs or birds evolving wings. 
However, random mutation alone cre- 
ates an unwinnable game of chance. 
Even evolution apologist Richard 
Dawkins observes that “it is grindingly, 
creakingly, crashingly obvious that, 
if Darwinism were really a theory of 
chance, it couldn’t work” (Dawkins, 
1996, p. 77). Instead, Dawkins argues 
that natural selection sorts through the 
various traits, favoring those benefiting 
the organism and casting aside those 
that are detrimental. This combination
is claimed to eliminate chance and 
achieve the needed changes. 

Dawkins compares the attainment of 
biological complexity to that of climb- 
ing a mountain (i.e., climbing Mount 
Improbable). He recognizes the sheer 
difficulty of climbing this mountain 
and acknowledges that “only God would 
essay the mad task of leaping up the 
precipice in a single bound” (Dawkins, 
1996, p. 77). He removes God from the 
process by envisioning the power of neo- 
Darwinism as its ability of 


breaking the improbability up into 
small, manageable parts, smearing 
out the luck needed, going around 
the back of Mount Improbable and 
crawling up the gentle slopes, inch 
by million-year inch. (Dawkins, 
1996, p. 77) 

As with most neo-Darwinists, 
Dawkins considers that the gradual 
accumulation of beneficial traits will 
slowly accomplish the task of originating 
enzymes, hormonal systems, respiratory 
organs, and brain activity. As such, ac- 
cording to Dawkins, random mutation 
combined with natural selection is 
equivalent to God. 

A problem Dawkins fails to recognize 
is the sheer task of climbing Mount 
Improbable, even by small, incremental 
steps. Thirty years ago Michael Denton 
mused that “the credibility of [Darwin- 
ism] is weakened,” not only by the high 
level of biological design known at that 
time, “but by the expectation of further 
as yet undreamt of depths of ingenuity 
and complexity” (Denton, 1986, p. 342). 
These expectations have certainly come 
to fruition. As a Berkeley biochemist la- 
ments, “It seems like we’re climbing a 
mountain that keeps getting higher and 
higher. ... The more we know, the more
we realize there is to know” (Hayden, 
2010, p. 664). Dawkins’s Mount Improb- 
able continues to get higher, steeper, 
and far more difficult to climb with 
each advancement of understanding. 
Since 1942, the growing knowledge of 
molecular genetics has not been kind to 
neo-Darwinism. 

In this special issue of the Creation 
Research Society Quarterly (CRSQ), 
Truman (2016) further highlights the 
unscalable nature of the mountain by of- 
fering an in-depth analysis of how living 
cells function as information processors. 
In this first of a two-part series, Truman 
describes the Boolean logic operations 
cells rely on, which involve multiple 
independent codes to accomplish the 
many and varied processes required to 
sustain life. Just as computer software
does not design itself, neither do living
cells design themselves. 

Climbing Mount Improbable re- 
quires the generation of new beneficial 
mutations (Stoltzfus and Yampolsky, 
2009), presumably resulting in new 
genes and genetic networks. Yet while 
beneficial mutations do occasionally ap- 
pear, they are ineffective in climbing this 
mountain. In this issue of the CRSQ, 
Anderson (2016a) notes that certain popular
examples of beneficial mutations
cannot accomplish the needed trans- 
formations; new genes simply are not 
formed. Instead, the adaptive changes 
are the result of shuffling of preexisting 
genes, loss of gene expression, and loss 
of gene regulation. This loss is more ap- 
propriately compared to descending the 
mountain (Anderson, 2016b). 

Interestingly, the absence of gene- 
forming mutations has led to the pro- 
posal that genetic loss is a driving force 
of Darwinian evolution. A “less-is-more” 
concept recognizes that most beneficial 
mutations are actually degenerative, not 
gene forming (Oh et al., 2015; Olson, 
1999; Wang et al., 2006; Zhu et al., 
2007). Organisms gain adaptive benefits 
from eliminating specific enzymes, 
regulatory systems, or transport proteins. 
However, universal common descent 
requires formation of new genetic 
systems. You cannot build specialized 
structures (such as feathers, legs, and 
wings) solely by eliminating preexisting 
genetic activity. Since the “less-is-more” 
concept cannot build the needed ge- 
netic systems, it assumes this is achieved 
by other, undocumented, mutations. 
Thus, the “less-is-more” model clearly
is descending, not ascending, Mount
Improbable, but carries the assumption 
that the mountain must somehow have 
been scaled. 

Molecular biologist James Shapiro 
states that randomness was originally 
inserted into Darwinism to exclude any 
hint of a creator. He adamantly insists 
that evolution must move beyond this 
original thinking to now include nonrandom,
directing genetic programs
(Shapiro, 2011). Clearly, even mutations 
underlying adaptive change, which are 
accepted by creationists, often do not 
appear to be the result of random muta- 
tion. In other words, the neo-Darwinian 
mechanism of random mutation does 
not even account for the biblically 
recognized changes that, for example, 
allowed the canids on the ark to diversify 
into foxes in the desert, arctic, and other 
regions of the globe. 

In this issue, Lightner (2016) sum- 
marizes some previous research in 
mammals suggesting that many muta- 
tions are not random and explores an 
enzyme used by the immune system to 
edit DNA in a highly regulated fashion. 
Again, while these mutations do not 
climb Mount Improbable, DNA editing 
enzymes may be able to explain why 
adaptive mutations are available when 
they are needed. This research lays a 
foundation for investigating whether this 
enzyme is at work in heritable germ-line 
mutations. 

It is not just the assumption of 
random mutations that fails to support 
a climb up the mountain. Natural 
selection also has been shown to be an 
unsuitable component of this mountain- 
climbing venture. Various estimates 
illustrate it would likely take millions of 
years for just a few beneficial mutations 
to become fixed within a population 
(Durrett and Schmidt, 2007; Sanford et 
al., 2015). This is a major obstacle for 
Darwinism. There is simply not enough 
time for dramatic transformations, even 
using its extensive timescale. In this is- 
sue, three of the media reviews discuss 
the role of natural selection as it relates 
to the natural history of several different 
species and briefly hint at other factors 
important in adaptive changes since 
the Flood. 

Several evolutionists have begun 
to agree with Shapiro (2011) and now 
recognize the need for a mechanism 
other than random mutation and natural
selection. The extended synthesis of
evolution is an attempt to integrate
additional mechanisms (such as trans- 
generational epigenetics and multilevel 
selection) to compensate for the weak- 
nesses of neo-Darwinism (Pigliucci and 
Müller, 2010). Others have a stated
goal of seeking what they call the “third 
way” —an alternative to neo-Darwinism 
and creation (www.thethirdwayofevolution.com).
By their own admission, they
are still seeking. 

Marshall (2015, p. 224) also con- 
cedes the need for a different mecha- 
nism, concluding that “if computer 
simulations have taught us anything, 
it’s that gradual accidental ‘Darwinian’ 
processes never succeed in ‘climbing 
Mount Improbable.’” Like Shapiro
(2011), Marshall (2015) proposes that a 
directing program is part of an alternate 
version of evolution, which he labels 
evolution 2.0. This version entails speci- 
fied alterations of DNA in response to 
certain environmental cues. As such, the 
mechanism of version 2.0 involves the 
action of “modular systems programmed 
to make sudden dramatic changes” 
(Marshall, 2015, p. 224). 

If evolutionary changes are not 
blind and undirected, then where 
would such nonrandom genetic pro- 
gramming originate? Shapiro (2011) 
attempts to offer some scenarios, but 
ultimately fails, because his arguments 
require such programs to be an inevi- 
table product of primordial blind and 
random processes. This is a problem 
every bit as significant for evolution to 
achieve as the problem these programs 
are intended to explain. 

On the other hand, Marshall (2015) 
readily recognizes this problem and 
directly attributes the origin of these pro- 
grammed modular systems to a creator. 
Interestingly, these modular systems are 
not unlike systems a biblical creation 
model also employs (e.g., hybridization, 
gene transfer, and epigenetics). With 
such created systems, organisms can 
present a wide variety of phenotypic 
traits. 


However, Marshall still seeks to 
incorporate the unnecessary and ge- 
netically untenable baggage of universal 
common descent. He envisions that the 
action of multiple combinations of these 
modular systems will move an organism 
“from any one spot on the tree of life to 
any other” (Marshall, 2015, p. 144). Yet 
he is unable to support this claim with 
naturally occurring examples. No combination
has ever been shown to move an
organism from any one location on the 
tree of life to any other given location. 
This is strictly conjecture and very poor 
conjecture at that. 

Using preexisting programs is not
truly an example of traversing
Mount Improbable, since such a climb 
requires formation of new genes, new 
regulatory networks, and new genetic 
systems. Marshall’s (2015) version 2.0 
requires God to do all the “heavy climb- 
ing” but lacks any means of moving 
upward past this point on its own. There 
is no experimental data demonstrating 
his grand claim of moving anywhere 
along the tree of life (i.e., scaling the
mountain’s entire slope). 

In addition, the bravado with which 
evolutionists claim significant evidence 
for common ancestry between humans 
and chimps is not in accord with the 
actual data. When comparing the 
DNA sequence between two or more 
species, certainly our imagination can 
always devise a story of how one species 
transforms into the other, regardless of 
their genetic relatedness. It is simply a 
matter of altering the DNA nucleotides 
as necessitated by the story. 

This ignores the reality that there 
is far more to genetic activity than the 
nucleotide sequence. Chromosome 
function is not just a simple linear code 
(Riva, 2014). Specific genomic sequences
are interwoven and multifunctional
(Djebali et al., 2012). Interchromo- 
somal interactions (i.e., chromosome 
kissing) facilitate three-dimensional 
topological domains between different 
chromosomes for specific gene activity 
(Choudhury et al., 2015). Gene-reading
frames are buried within other reading 
frames, and many genes contain overlap- 
ping regions across both strands of DNA 
(Zhao et al., 2015). All proposed mecha- 
nisms of universal descent collapse in 
the face of this virtually insurmountable 
mountain. 

What is more, changing the nucleo- 
tide arrangement can also disrupt proper 
cell function. Such disruptions can result
in disease or even death of the organism.
Thus, it is reasonable to suspect that
there are many chromosomal regions 
where no path exists for the assumed 
DNA changes to occur. This is especially 
relevant when we consider that proper 
chromosome function is essential for a 
living organism. 

Yet the assumption of common 
ancestry is so strong that it overrides all 
contradictory evidence. For example, 
Bradley (2008) claims there is excellent 
molecular support for a human-ape 
divergence 4 to 8 million years ago. Yet 
she acknowledges that this molecular 
evidence is often in direct conflict with 
morphology-based evidence, causing 
evolutionists to dramatically alter their 
account of how humans and primates 
are related. She further describes the 
decades it took for evolutionists to sort 
out the alleged divergence of humans, 
chimps, and gorillas because the differ- 
ent lines of data gave conflicting answers 
(Bradley, 2008). So physical similarities 
that were once promoted as excellent 
evidence of common descent are now 
explained as having arisen separately. 
Ways of explaining other aberrant data 
have also been suggested (Som, 2015). 
Thus, the assumption of common 
ancestry remains unquestioned, while 
its presumed evidence keeps changing 
radically. 

Other claims of evidence support- 
ing common ancestry between humans 
and chimps were put forth a number of 
years ago by popular evolution advocate 
Kenneth Miller (2007). He noted that 
chimpanzees, like other great apes,
have 24 pairs of chromosomes, while
humans have 23 pairs. Evolutionists 
assume that two chromosomes fused 
together as we humans traversed a
different evolutionary path from our
chimpanzee cousins (Fan et al., 2002). 
Confident in his conjecture and pre- 
mature conclusions about the clear 
evidence for the fusion, Miller willingly 
conceded, “If we don’t find [evidence 
for the fusion], evolution is wrong. 
We don’t share a common ancestor” 
(Miller, 2007). Contrary to Miller’s 
confidence, further investigation makes 
it clear that there is no evidence for a 
fusion (Tomkins and Bergman, 2011;
Tomkins, 2013); ergo, we do not share 
a common ancestor. 

In this special CRSQ issue, Tomkins 
(2016) further investigates the human 
genome. He observes that humans have 
gene sequences that are remarkably 
different from those found in any other
species where these genes appear. His 
in-depth look at these sequences not only 
highlights human uniqueness in relation 
to the animal kingdom but also shows 
how it directly contradicts universal com- 
mon ancestry. As expected, considerable 
storytelling has been used by evolution- 
ists in an attempt to accommodate the 
data, but the evidence is clearly more 
consistent with a biblical model, where 
humans were specifically created by 
God, and do not share a common evo- 
lutionary history with chimpanzees or 
any other organism. 

In addition, genetic evidence sup- 
ports a recent origin for humans. For 
example, human genealogies all over- 
lap in the recent past. This indicates all 
humans descended from an ancestor 
that lived as recently as about 3000 BC (Rohde
et al., 2004). Also, most of the single 
nucleotide variation affecting protein- 
coding genes among humans arose in 
just the last few thousand years (Fu et 
al., 2013). 

Despite some uninformed criticism, 
there is also very good genetic evidence 
for the existence of an original human
pair (i.e., Adam and Eve). In-depth
analysis of this genetic data even shows 
it fits well in a few-thousand-year time 
frame (Jeanson, 2015). In this CRSQ is- 
sue, Carter and Lightner (2016) expand 
upon this concept. Exploring more of 
the DNA data, they describe genetic 
lineages that are consistent with the 
three daughters-in-law of Noah and the 
great population dispersion from Babel. 

All of this genetic data is consistent 
with a recent origin of humans, lack of 
universal common descent, and overall 
failure of Darwinism to climb Mount 
Improbable. This mountain has not been,
and cannot be, climbed by mutations,
transposition, or even limited prepro- 
grammed systems. Evolutionists fre- 
quently offer examples of skids down the 
mountain (e.g., most mutations) or start 
at a higher point on the mountain by 
using modular systems (e.g., evolution 
2.0), but none are able to account for 
scaling even small sectors of the moun- 
tain. Ascension of Mount Improbable 
requires the direct action of a creator. 
This occurred once (Genesis 1) and has
not been repeated. 

All the articles in this special issue 
provide a valid and dynamic description 
of the genetic basis of a creation model. 
Genetic data not known just a few years 
ago add to the failure of common de- 
scent and the vibrant explanatory power 
of biblical creation. There is clearly not 
a lack of genetic evidence for creation. 
Critics are generally either ignorant of 
this evidence or simply unwilling to 
grasp its significance. 
Human Genetic Data Affirms
Biblical History on Many Levels
and Is an Excellent Resource
for Creation-based Research


Robert W. Carter and Jean K. Lightner*


Abstract


Some have claimed that modern genetic data is at odds with biblical
history. Yet closer examination reveals that the opposite is true. In
terms of the origin of humanity, genetic data support the fact that all
humans alive today can trace their ancestry back to a single male and a
single female. When evolutionary assumptions are discarded and actual
observable mutation rates are used, the molecular clock indicates that
those individuals lived within a biblical time frame. Analysis of the human
mitochondrial data reveals three major mitochondrial lineages,
which appear to point to the three daughters-in-law of Noah. The Y
chromosome distribution pattern supports a single paternally based
dispersion as expected from the Babel event. Yet many questions remain,
even as genetic data accumulate and computers make modeling more
accessible to those outside the traditional university setting. The time
is ripe for productive creationist research to answer important questions
about the genetic history of humans using the wealth of data and tools
now at our disposal.


Introduction


The history presented in Genesis makes
it clear that humans were created in
God’s image, separately from all other
animals (Genesis 1:20-27). Adam was
created directly from the ground, and
Eve was made from his side (Genesis
2:7, 21-22). As humans reproduced and
filled the earth, the earth became filled 
with evil, so God chose to send a Flood 
to destroy the inhabitants (Genesis 
6:5-7). Noah, his wife, his three sons, 
and their wives were the only humans 
that survived the global cataclysm (Genesis
6:18; 7:7, 13; 8:16; 1 Peter 3:20).
All humans alive today have descended 
from them. Biblical data (Genesis 5, 11) 
and secular history enable us to estimate 
the time of Creation around 6,200 years 
ago and the Flood around 4,600 years 
ago (Hardy and Carter, 2014). 

If this record is correct, it should 
be consistent with observations we can 
make today. Over the past several de- 
cades, an enormous amount of genomic 
data has been generated. This includes 
large-scale projects such as the HapMap 
project and the 1000 Genomes Project
(International HapMap 3 Consortium, 
2010; 1000 Genomes Project, 2015). 
While it is recognized that some errors 
are present in the data (Tomkins, 2011; 
Merchant et al., 2014; Carter, 2007), 
there should still be good agreement 
between the genomic data and the pre- 
dictions one can make based on biblical 
history. Indeed, this has been affirmed in 
creationist journals (e.g., Carter, 2009; 
Jeanson, 2015), in Protestant theologi- 
cal journals (e.g., Sanford and Carter, 
2014), and in a two-part article designed 
to reach out to Catholics (Sanford and 
Carter, 2015a, 2015b).

A major point of controversy in- 
volves the question of whether the 
currently observed human genetic 
variation is compatible with all humans 
descending from a single couple around 
6,000 years ago. Dr. Francis Collins (a 
prominent evangelical Christian in the 
world of science, the former director 
of the Human Genome Project, and 
the current director of NIH) has gone 
on record as stating, “There is no way 
you can develop this level of variation 
between us from one or two ancestors” 
(Adkisson, 2011). 

Similarly, Dennis Venema, Collins’s
colleague at the theistic-evolution-promoting
organization BioLogos, has said:

You would have to postulate that 
there’s been this absolutely astro- 
nomical mutation rate that has 
produced all these new variants in 
an incredibly short period of time. 
Those types of mutation rates are just 
not possible. It would mutate us out 
of existence. (Haggerty, 2011) 

Are these claims correct? How 
would we know and what, exactly, does 
the Bible predict about human genet- 
ics? This paper discusses some initial 
considerations essential to consistently 
interpreting the genetic data within a 
biblical framework. It will also lay some 
groundwork on what has been done, and 
what needs to be done, to model human 
genetic history from a biblical perspective.
Such a model can help us understand
our past (e.g., human migrations)
and potentially may provide insights 
about human diversity as it relates to 
adaptation and disease. 


Designed Diversity 
All people on earth today have come 
about through the normal process of 
sexual reproduction. Gamete produc- 
tion in the mother and father created 
haploid versions of the parental genome 
through the process of meiosis. During 
this process, the complementary copies
of the parental autosomes recombined 
in large sections, gene conversion oc- 
curred, and mutations were introduced. 

Unlike everyone alive today, however,
the genomes of Adam and Eve did not
come about through natural processes.
This is an incredibly important consideration
for us and one that our opponents
have rarely acknowledged. If Adam and
Eve were specially created, we have
multiple starting possibilities:

• Adam and Eve had unique genomes,
  with two original copies of each
  autosome (this is a good starting
  assumption); or

• Eve was a near clone of Adam, with
  the exception that she had no Y
  chromosome; or

• Eve was a haploid clone of Adam,
  essentially a product of meiosis, but
  with doubled chromosomes; or

• Adam and/or Eve were created with
  multiple genomes, possibly a different
  haploid set of chromosomes
  in each of their reproductive cells,
  essentially limiting future human
  genetic diversity only by the number
  of children they could potentially
  have.

Authors such as Collins and Venema 
are assuming there was no designed 
human diversity in Eden. According to 
that assumption, the four sets of chro- 
mosomes in Eden (two sets in Adam and 
two sets in Eve), would have all been 
identical. The only exception to this
would have been the sex chromosomes
(otherwise Adam and Eve would have 
necessarily both been female). This 
assumption is both unjustified and un- 
reasonable. There is no reason to think 
any two chromosomes in Eden would 
have been identical. Even as Eden must 
have had designed sexual diversity (male 
and female), every chromosome could 
have carried unique alleles. Thus, the 
antediluvian population could have had 
much more genetic diversity than is seen 
among people today. Even if Eve was a 
near clone of Adam, Adam could have 
himself been heterozygous at tens of mil- 
lions of nucleotide positions. Therefore, 
Venema’s statement above rests on an
error. He assumes he is starting from a
blank slate, essentially a couple contain- 
ing zero genetic diversity. 

The available data can help us make 
estimates of created diversity in Adam 
and Eve. Theoretically, one of four 
nucleotide “letters” must appear at any 
position in the genome (A, C, G, or T). 
But when examining any specific loca- 
tion, one person might have a different 
letter in that position than another. Most 
variation is biallelic (in other words, only 
two letters are found at that location 
among all the people on earth), and 
there are millions of variable positions 
of this nature in the human genome 
(International HapMap 3 Consortium, 2010).
Thus, any two people will have millions
of single-letter differences between them.
Yet, these variable locations are largely 
shared among all people groups, imply- 
ing that this variation was established in 
the very early human population. From 
a biblical perspective, that means these 
variations had to predate the Babel dis- 
persion, when the human population 
became fragmented linguistically and 
geographically (Genesis 11:1-9). Most 
reasonably, the majority of this genetic 
diversity would have been present in 
Adam and Eve at Creation, which could 
easily mean 10-100 million or more 
positions were created heterozygous
(Carter, 2011).
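To make the term "biallelic" concrete, the following minimal Python sketch scans a few aligned toy sequences, lists the positions at which more than one letter occurs, and counts the single-letter differences between two of them. The sequences and the function names (`variable_sites`, `pairwise_differences`) are illustrative assumptions only; they are not data or methods from the studies cited above.

```python
from collections import Counter


def variable_sites(sequences):
    """Return (position, sorted alleles) for every position that varies.

    `sequences` is a list of equal-length strings over A, C, G, T,
    one per sampled chromosome (toy data, not real genomes).
    """
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    sites = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in sequences)
        if len(counts) > 1:  # more than one letter observed at this position
            sites.append((pos, sorted(counts)))
    return sites


def pairwise_differences(seq_a, seq_b):
    """Count single-letter differences between two aligned sequences."""
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


if __name__ == "__main__":
    toy = ["ACGTAC", "ACGTAT", "ATGTAC", "ATGTAG"]  # hypothetical sample
    for pos, alleles in variable_sites(toy):
        kind = "biallelic" if len(alleles) == 2 else "multiallelic"
        print(pos, alleles, kind)
    print(pairwise_differences(toy[0], toy[3]))
```

A position is reported as biallelic when exactly two letters are seen across the whole sample, which is the sense in which the term is used in the paragraph above.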

During meiosis, homologous recombination
shuffles the alleles (variants)
between chromosomes. This occurs via 
crossing over and gene conversion. Since 
there are usually only one or two cross- 
overs that occur per chromosome arm, 
large sections of DNA remain together 
on a chromosome as it is passed on. 
These regions are known as haplotype 
blocks and are recognized by a particular 
combination of alleles. Over many gen- 
erations these regions should become 
more scrambled, shuffling the alleles, 
and resulting in haplotype blocks that 
are considerably smaller. Gabriel et al. 
(2002) estimate that most of the genome 
is contained in haplotype blocks of sub- 
stantial size. The specific haplotypes and 
their boundaries were frequently shared 
across different populations of humans. 
All this is consistent with the population 
bottleneck at the Flood followed by a
dispersion after the Babel incident
several thousand years ago. 
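The expectation that haplotype blocks shrink over generations can be illustrated with a deliberately simplified Python sketch. It follows one chromosome down a single line of descent and assumes that every crossover adds a new breakpoint at a uniformly random position; gene conversion, recombination hotspots, and crossovers between identical segments are all ignored. The function name, the chromosome length, and the default of one crossover per generation are hypothetical choices for illustration, not values taken from Gabriel et al. (2002).

```python
import random


def mean_block_length(chrom_length, generations,
                      crossovers_per_generation=1, seed=0):
    """Mean length of unbroken ancestral segments after `generations`.

    Simplifying assumption: each crossover adds one breakpoint at a
    uniformly random position between bases 1 and chrom_length - 1.
    """
    rng = random.Random(seed)
    breakpoints = set()
    for _ in range(generations):
        for _ in range(crossovers_per_generation):
            breakpoints.add(rng.randrange(1, chrom_length))
    # k distinct breakpoints cut the chromosome into k + 1 blocks
    return chrom_length / (len(breakpoints) + 1)


if __name__ == "__main__":
    length = 100_000_000  # hypothetical chromosome length in base pairs
    for gens in (0, 10, 100, 200):
        print(gens, round(mean_block_length(length, gens)))
```

Under these assumptions the mean block length falls roughly in proportion to the number of generations elapsed, so comparatively large blocks correspond to comparatively few generations of shuffling.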


The sequence of the genome can be 
changed by mutation. This could be 
a single nucleotide change, such as a 
transition from C to T. Alternatively, mutations
may result in structural changes
such as the duplication or deletion of 
a region, producing what is known as 
copy number variants (CNVs). It is now 
recognized that CNVs are very common 
sources of variation between humans. 
While some have no known effect, oth- 
ers are associated with adaptation or 
disease (Zarrei et al., 2015). Other struc- 
tural rearrangements, such as inversions, 
can occur as well (Sudmant et al., 2015). 

While there are even more alleles 
present in the human population that 
are attributable to mutation, most 
are not as widespread. Any individual 
human carries mostly common vari- 
ants, which are likely created diversity, 
and fewer population-specific or even 
“private” alleles, which should largely 
be attributable to mutation (but see
caveat under “The Effects of the Flood
Bottleneck”). Detailed analysis of pat- 
terns in these alleles is important for 
understanding human genetic history 
as well as factors influencing adapta- 
tion and disease. Genetic variants that 
are widespread must have arisen early 
in human history; genetic variants that 
are very rare are much more likely to be
“young” mutations. Interestingly, recent 
analyses by evolutionists have revealed 
that most protein-coding variants appear 
to be of very recent origin (Tennessen et
al., 2012; Fu et al., 2013). Again, even 
though evolutionary assumptions were 
used in the estimates, the findings are 
consistent with the biblical historical 
parameters. 


Historic Population Sizes 
Speaking of the acceptable ranges of 
biblical parameters, historic population 
sizes are also important for us to con- 
sider. The size of a population dictates 
how much diversity it can hold, for 
small populations are subject to genetic 
drift: random sampling of the gene pool 
each generation can lead to significant 
changes in allele frequency in small 
populations. Genetic drift slows to a 
crawl in populations numbering in the 
thousands, and is essentially nonexistent 
in large populations. 
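
The effect of population size on drift can be illustrated with a minimal Wright-Fisher style simulation. The sketch below is an illustration only, not the model used in any study cited here; the function names, population sizes, and generation counts are assumptions chosen for the example.

```python
import random

def wright_fisher(pop_size, generations, start_freq=0.5, seed=None):
    """Follow one allele's frequency under pure drift (no selection or mutation)."""
    rng = random.Random(seed)
    copies = 2 * pop_size  # diploid population: two gene copies per individual
    freq = start_freq
    history = [freq]
    for _ in range(generations):
        # Every gene copy in the next generation is sampled at random from the
        # current gene pool, so the new count is a binomial draw.
        count = sum(1 for _ in range(copies) if rng.random() < freq)
        freq = count / copies
        history.append(freq)
    return history

def largest_shift(history):
    """Largest departure from the starting frequency seen in one run."""
    return max(abs(f - history[0]) for f in history)

if __name__ == "__main__":
    # Hypothetical population sizes, for illustration only.
    for size in (10, 100, 1000):
        run = wright_fisher(size, generations=50, seed=1)
        print(size, round(run[-1], 3), round(largest_shift(run), 3))
```

Running several replicates at each size shows the pattern described above: the frequency wanders widely, and often reaches 0 or 1, in very small populations, while it stays close to its starting value in large ones.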

Carter and Hardy (2015) used 
computer simulations to estimate the 
population sizes before the Flood, be- 
tween the Flood and Babel, and within 
the nation of Israel during their sojourn 
in Egypt. The latter has been a frequent 
target of attack by skeptics who claim it 
is impossible for the Israelites to have 
attained the population size suggested in 
the Bible (Exodus 12:37-38; Numbers 
1:46). On the contrary, simulations with 
some parameters indicate that attaining 
a population size of 2.7 million was pos- 
sible within 215 years. If the Israelites 
were in Egypt longer, as many believe 
the Bible teaches, reaching such a popu- 
lation size was a trivial matter. 
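
The arithmetic behind such a claim can be checked with a simple exponential growth calculation. The sketch below is not Carter and Hardy's simulation; it uses the 2.7 million and 215-year figures given above, and the starting size of 70 people is an assumption introduced here purely for illustration.

```python
import math

def required_growth_rate(start_size, final_size, years):
    """Continuous annual rate r such that start_size * exp(r * years) == final_size."""
    return math.log(final_size / start_size) / years

def doubling_time(rate):
    """Years needed for a population to double at a constant continuous rate."""
    return math.log(2) / rate

if __name__ == "__main__":
    start = 70          # ASSUMPTION: hypothetical starting size, not from the article
    final = 2_700_000   # population size quoted in the text
    years = 215         # time span quoted in the text
    r = required_growth_rate(start, final, years)
    print(f"required growth rate: {100 * r:.2f}% per year")
    print(f"doubling time: {doubling_time(r):.1f} years")
```

Under these assumptions the required rate is roughly 4.9% per year, which corresponds to the population doubling about every 14 years. A different starting size or a longer sojourn changes the result accordingly.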


In contrast to the Exodus event, we 
do not have any biblical data that would 
allow us to estimate the population size 
at the Flood or at Babel. However, large 
population sizes at these events, and 
rapid reestablishment of large popula- 
tions after each event, would have been 
relatively easy, given realistic population 
growth parameters. So, like the designed 
diversity example above, when we con- 
sider the relevant biblical parameters, 
there is no difficulty establishing appro- 
priate population sizes in the given time. 
We are not limited to any particular 
population size, and thus the biblical 
model can handle data that demand 
either large or small historic population 
sizes. In other words, we have far more 
flexibility than many of our antagonists 
appear to assume. 


The Effects of 
the Flood Bottleneck 

Carter and Powell (2016) showed that 
the biblical claim that the entire hu- 
man population was reduced to three 
reproducing couples is not problematic. 
There are multiple scenarios (assum- 
ing rapid population growth) in which 
almost no created diversity would be 
lost due to genetic drift. There are other 
scenarios (those with very slow growth, 
or if Noah’s family were a small sample 
of the antediluvian population) where 
genetic drift would have been extreme. 
In high-drift scenarios, initial allele fre- 
quencies can rapidly change from 50:50 
(the distribution they assumed in Adam 
and Eve) to extremely high/low allele 
frequencies. In these cases, a great deal 
of allelic fixation/extinction can occur, 
resulting in extensive loss of the initial 
allelic diversity. Intermediate levels of 
drift would result in partial loss of allelic 
diversity and a limited number of low 
frequency alleles (that are not derived 
by mutation). 
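
The contrast between the rapid-growth and slow-growth scenarios can be sketched with the standard population-genetics expectation that a fraction 1/(2N) of heterozygosity is lost per generation in a population of effective size N. This is a simplified illustration, not the model of Carter and Powell (2016); the six founders correspond to the three couples mentioned above, while the growth factors and the number of generations are hypothetical.

```python
def growth_trajectory(founders, growth_factor, generations):
    """Population size in each generation under constant proportional growth."""
    return [founders * growth_factor ** g for g in range(generations)]

def heterozygosity_retained(sizes):
    """Expected fraction of the starting heterozygosity left after drift."""
    retained = 1.0
    for n in sizes:
        retained *= 1.0 - 1.0 / (2.0 * n)  # expected loss of 1/(2N) each generation
    return retained

if __name__ == "__main__":
    fast = growth_trajectory(6, 2.0, 20)   # hypothetical: doubling each generation
    slow = growth_trajectory(6, 1.05, 20)  # hypothetical: 5% growth per generation
    print(round(heterozygosity_retained(fast), 3))
    print(round(heterozygosity_retained(slow), 3))
```

With these numbers, fast growth retains most of the starting heterozygosity while slow growth loses the majority of it, which matches the qualitative contrast drawn above.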

When Carter and Powell (2016) 
compared their models to the real-world 
genetic diversity found among multiple 
world populations, they concluded that
modern humanity has experienced 
a large amount of genetic drift. This 
does not contradict the information in 
the paragraph above, but it does mean 
that of all the possible genetic history 
models, those with a strong bottleneck 
effect are more likely to reflect biblical 
human history. But when comparing 
Europeans to East Asians to Africans, 
they also saw that the allele frequency 
in one population was a strong predic- 
tor of the allele frequency in the other 
populations. In other words, the allele 
frequency spectrum was set up prior to 
Babel. Genetic drift must have occurred 
between Adam and Noah. 


Mitochondrial DNA 
and Y chromosomes 


Interestingly, it was the evolutionists 
who uncovered genetic evidence of a 
single woman (Mitochondrial Eve) and 
a single man (Y chromosome Adam) 
founding the human race. They also 
uncovered evidence of a severe popula- 
tion bottleneck, from which they con- 
struct their out-of-Africa model (Carter, 
2010). These genetic situations are more 
consistent with a creation model than 
with evolution. 

Mitochondria are organelles found 
in the cytoplasm of cells. They have 
some of their own DNA, which is sepa- 
rate from nuclear DNA yet considered 
part of the genome (all DNA of an or- 
ganism). Mitochondrial DNA is passed 
down from mother to child, apparently 
with no contribution from the father. 
Based on differences in the sequence 
between people, it is clear we all could 
have come from one individual female, 
often called “Mitochondrial Eve.” 

Evolutionists place a time frame 
of when “Mitochondrial Eve” lived by 
assuming common ancestry between 
humans and chimps and the evolution- 
ary timescale. However, when measured 
mutation rates in mitochondrial DNA 
were used, “Eve” was calculated to have
lived around 6,000 years ago. Of course
the evolutionists do not accept this time 
frame, so they have sought ways around 
the implications (Gibbons, 1998; Jazin 
et al., 1998). More recent in-depth 
analysis of mitochondrial DNA has up- 
held this biblical time frame for humans 
and found the same pattern in other 
organisms as well (Jeanson, 2014, 2015). 

The out-of-Africa model was pro- 
posed by evolutionists to address the fact 
that patterns of genetic variability suggest
a bottleneck occurred in the human
lineage, and patterns of mitochondrial 
DNA variability across various popula- 
tions suggested it may have originated 
from Africa (Cann et al., 1987). While 
various studies occasionally produce 
conflicting results, this is still the most 
popular evolutionary model of human 
history, partially because there is so 
much genetic diversity among Africans. 
The time frame and area from which 
humans dispersed differ from the Bible, 
but there are three major mitochondrial 
lineages that have been recognized (Wi- 
tas and Zawicki, 2004; also see Figure 
1). Carter (2009) has pointed out that 
there are other possible reasons for high 
genetic diversity in Africans, and (2010) 
that there is a more plausible ancestral 
sequence than the one proposed by 
evolutionists (Figure 2). 

The human Y chromosome is re- 
markably similar among all humans, 
and the mutation rate is so slow it is difficult
to detect (Jobling and Tyler-Smith,
2003). This is consistent with the biblical 
account, where Noah would have passed 
his Y chromosome on to his three sons 
less than 5,000 years ago. Yet, the chim- 
panzee Y chromosome is radically differ- 
ent from the human Y, which is a chal- 
lenge for evolutionists to explain even in 
their extended time frame (Hughes et al.,
2010). If humans and chimps had a com- 
mon ancestor several million years ago, 
evolutionists are forced to propose that 
the Y chromosome mutated incredibly 
fast. But if all human males have very 
similar Y chromosomes (and they do),
Y-chromosome Adam must have lived
a very short time ago. Either way this is 
not consistent with evolutionary predic- 
tions. In contrast, this fits well with the 
biblical history of humans being created 
separately from all other animals. 
Interestingly, global patterns in the 
Y chromosome suggest a less complex 
migration pattern than for mitochondrial 
DNA. It has been suggested that men 
generally have their families closer to 
their place of birth, and women leave 
their families to follow the men (Jobling 
and Tyler-Smith, 2003). This pattern is 
also consistent with the Babel dispersion, 
where families were spread accord- 
ing to identity of the fathers (Genesis 
10:1-11:6), and so we would expect the 
mothers to be spread among the men. 


Summary 

The human genetic data is remarkably 
consistent with the biblical history. 
There is evidence that all humans trace 
their ancestry back to a single male 
and female, Adam and Eve. Genetic 
evidence points to a severe bottleneck, 
a dramatic decrease in population size, 
as we would expect from the Flood. 
Outside of Africa, there are three major 
lineages of mitochondrial DNA that 
would correspond to Noah’s three 
daughters-in-law; yet there is a single 
worldwide lineage of Y chromosome 
that came from Noah through his three 
sons. Inside of Africa, the rarest sequenc- 
es are also the most deviant. In other 
words, the out-of-Africa theory is based 
on statistical outliers! There is evidence 
of a single dispersion by families accord- 
ing to paternity, which corresponds well 
to the Babel event. When evolutionary 
assumptions are dropped and actual 
mutation rates are used, these events are 
within the biblical time frame. 

Yet there is much information the 
Bible does not directly tell us, even
while it does set limits for possible bib- 
lical models of human genetic history. 
For example, in Carter and Powell’s 
Figure 1. The evolutionary map of world migrations based on mitochondrial DNA has some striking similarities to predictions
based on the biblical history. The out-of-Africa theory tells of a single dispersal of people, centered near and travelling
through the Middle East, in the recent past. This type of pattern, with the migration originating in the Middle East, is
predicted based on the history surrounding the Tower of Babel. Map from mitomap.org (http://www.mitomap.org/pub/MITOMAP/MitomapFigures/WorldMigrations2013.pdf).





model, the data forced them to conclude 
that either the antediluvian population 
was small or Noah and his wife and/or 
daughters-in-law were closely related. Is 
it unfair of us to appeal to a limited set 
of explanatory models when trying to 
fit the data to biblical history? Hardly, 
for this is exactly how the out-of-Africa 
theory developed (Carter, 2009), and it 
is still common practice among evolu- 
tionists today (Henn et al., 2016). Not 
only that, but most students of Creation
and the Flood also have assumed the
Flood bottleneck would involve a high
degree of inbreeding, with possible loss
of original diversity. This is especially
true since Wieland’s provocative 1994
article on the subject (Wieland, 1994).
The inbreeding we might expect during
the Flood/Babel period would produce
exactly the allele frequency spectrum we
see among modern people today.

Much of the discussion above could
not have been part of any serious analysis
of biblical history prior to just several
years ago. The main reason for this has 
been the rise of powerful computers. 
With the rise of cloud computing, indi- 
viduals now have inexpensive access to 
high-level computing resources once 
reserved for universities and govern- 
ments. We would like to appeal to oth- 
ers interested in these subjects to build 
their own computer models. There are 
many questions remaining, and much 
refinement to existing conclusions can 
Figure 2. A diagram showing the relationship among major mitochondrial lineages. Evolutionists root the “tree” in Africa,
but the sequences from there are statistical outliers. Carter (2007) placed the root at R, based on the most common nucleotides
in each position across different human populations, but finding the location of the real root is a matter of statistics
and historical uncertainties. Diagram from mitomap.org (http://www.mitomap.org/pub/MITOMAP/MitomapFigures/simple-tree-mitomap-2012.pdf).


be done. For example, if Neanderthals 
are human, how can we account for the 
presence of such genetically distinct 
humans that early in post-Flood his- 
tory? And if Neanderthals interbred with 
humans early in modern human history 
(Kuhlwilm et al., 2016), what does this 
mean for the out-of-Africa theory since
Neanderthals were supposedly not part 
of the bottleneck that led to the origin 
of “Homo sapiens”? And if sub-Saharan
Africans came out of Babel, why do they 
display higher levels of genetic diversity
than the rest of the world put together?
These are fascinating questions, and as 
of right now they seem to be answered 
only by evolutionists. Creationists need 
to continue to develop competing robust 
models. 

Robust creation models serve a purpose
beyond just satisfying our curiosity
about our history. A robust creation
model that fits the data well can be used 
to make predictions, further test between 
the biblical history and the evolutionary 
one, and possibly give us valuable insights
that relate to questions about adaptation
and disease. These models would
also help effectively counter challenges 
frequently leveled at biblical Creation. 
There is a tremendous opportunity for 
creation research in this area. 
Abstract

The Bible clearly states that humans were created in the image of
God (Genesis 1:26-27). This makes us distinct in certain ways
from the rest of the creatures God created, including primates. In addition
to obvious outwardly visible trait differences, it would make sense
that we would find certain regions of the genome that are distinctly
different between humans and other animals, and this is in fact seen.
Secularists postulate that these genetic differences arose from accelerated
evolution since the time that humans allegedly diverged from
apes; thus they call these regions human accelerated regions (HARs).
HARs are exceedingly problematic for evolutionists because
they tend to be highly conserved across vertebrates but are markedly
different in humans. However, within supposed vertebrate lineages,
many of these regions are taxonomically isolated: they seem to arise
suddenly, with no evolutionary history. A new phylogenetic analysis of
105 HAR genes in 10 different vertebrate taxa shows that these sequences
also display remarkable phylogenic discordance on a broad scale. This
is inconsistent with the idea that these genes were generally conserved
for tens or hundreds of millions of years but then suddenly evolved into
taxonomically restricted forms. The data is more consistent with the
creation model, wherein the genes that encode taxonomic distinction
were custom designed.


Introduction

At the most fundamental level of objective
discernment, even a child can
clearly tell the difference between a
human and a chimpanzee. However,
the secular idea that humans somehow
evolved from apes has become a leading
icon of the evolutionary paradigm.
In a creationist sense, this is one of the
most objectionable components of the
whole evolutionary paradigm because
the Bible not only indicates that God
made each creature “after its kind” but
also that humans were uniquely created
in God’s image.

While many creatures exhibit dis- 
tinct genetic differences, the issue of 
human relatedness to apes is seemingly
bolstered in an evolutionary sense by
regions of high DNA similarity between
humans and great apes (chimpanzees, 
gorillas, and orangutans), although 
pervasive inconsistencies, which evolu- 
tionists attribute to incomplete lineage 
sorting, negate a clear path of common 
ancestry (De et al., 2009; Ebersberger 
et al., 2007; Hobolth et al., 2007; Pat- 
terson et al., 2006). In addition, the 
DNA similarity paradigm, particularly 
in regard to human and chimpanzee 
DNA similarity, tends to be dominated 
by studies utilizing selective data that 
excludes genomic regions that are dissimilar
(Bergman and Tomkins, 2012;
Tomkins and Bergman, 2012). 

Another problem in comparing hu- 
man and ape DNA sequence is that great 
ape genomes, including chimpanzee, 
are computationally assembled from 
small individual sequence reads using 
the human genome as a reference 
sequence, and thus they appear to be 
more humanlike than they really are 
(Chimpanzee Genome Sequencing 
Consortium, 2005; Prado-Martinez et 
al., 2013; Tomkins, 2011). This problem 
is compounded even further by the fact 
that the chimpanzee genome is largely 
still a rough draft with numerous un- 
sequenced gaps. In fact, a large number 
of studies, based on flow cytometry of 
nuclei and cytogenetic analyses of band- 
ing patterns, estimate that on average 
the chimpanzee genome is about 8% 
larger than human with significantly 
more heterochromatic DNA (Formenti 
et al., 1983; Koop et al., 1986; Pellic- 
ciari et al., 1982; Pellicciari et al., 1988; 
Pellicciari et al., 1990a; Pellicciari et al., 
1990b; Seuanez et al., 1977). At present, 
it appears the alignable regions of the 
human and chimpanzee genomes are 
on average about 88% similar (Tomkins,
2015b). Nevertheless, there are many 
regions of apparent similarity between 
the genomes that are about 98% identi- 
cal. It is these regions that are typically 
compared by evolutionists because they 
are conducive to hypothetical analyses 
regarding selection. 


One of the features of the human 
genome that has been of particular 
interest to evolutionists during the past 
decade is termed human accelerated 
regions (HARs). These regions are a 
double-edged sword for the evolutionary 
paradigm in that they are both highly 
conserved (similar across taxa) yet mark- 
edly different in humans compared to 
other animals (particularly chimpan- 
zees). Therefore, there is interest in 
finding such sequences and functionally 
characterizing them, as such sequences 
may help us understand what makes us 
uniquely human. 

The detection of alleged accelerated 
regions of evolution assumes that evolu- 
tion on a grand scale has actually oc- 
curred and requires a significant amount 
of hypothetical modeling. Under this 
assumption, DNA substitution rates are 
estimated based upon highly similar ge- 
nomic regions from humans, great apes, 
and other vertebrates. These regions are 
so similar that they generally do not con- 
tain many sequence gaps (insertions or 
deletions) between taxa. In other words, 
the differences are primarily in single 
bases, called substitutions. 


Early Discoveries of HAR 
and the Enigma of HAR1 
The first popularized discovery of a HAR
(designated HAR1) was a 118 base pair
(bp) region that showed 18 base substitutions
compared to its counterpart in
the chimpanzee genome (Pollard et al.,
2006a). When this region was assessed
for variability among humans, it was
found to be fixed in human populations
(nonvariable). Making this discovery
even more remarkable was that when
the same genomic segment from chimpanzee
and chicken were compared,
there was only a 2 base difference out
of the 118 bases. In the evolutionary
mindset, the region clearly was highly
conserved across taxa, but why was
it so different between humans and
chimpanzees? Hence the name human
accelerated region is based on the evolutionary
belief that it must have changed
very rapidly after humans diverged from
chimps. While scientists found the data
to be especially intriguing, the results
defied the evolutionary paradigm of slow
and gradual evolution of the genome.
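
The comparison described above comes down to counting mismatched bases between aligned sequences. The following sketch shows that calculation; the short sequences are made-up placeholders rather than real HAR1 data, and only the counts of 18 and 2 differences out of 118 bases are taken from the text.

```python
def count_substitutions(seq_a, seq_b):
    """Count positions where two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)

def percent_identity(differences, length):
    """Percentage of positions that match, given a difference count."""
    return 100.0 * (length - differences) / length

if __name__ == "__main__":
    # Placeholder sequences (not real HAR1 data) with two mismatches.
    print(count_substitutions("ACGTTGCA", "ACCTTGAA"))
    # Counts reported in the text for the 118 bp HAR1 region.
    print(round(percent_identity(18, 118), 1))  # human vs. chimpanzee
    print(round(percent_identity(2, 118), 1))   # chimpanzee vs. chicken
```

These figures correspond to roughly 85% identity between human and chimpanzee and roughly 98% identity between chimpanzee and chicken for this region.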

Even more intriguing was the fact 
that homologs for HAR1 could not
be found in frog or any fish genome 
(Pollard et al., 2006a). Therefore, 
since it was functional and present 
in a chicken-like common ancestor 
(presumably about 310 million years 
ago according to evolutionary theory), 
then it originated “suddenly” on the 
evolutionary scene in some vertebrate 
ancestor about 400 million years ago. 
In this light, the HAR1 sequence appeared
suddenly in vertebrates with no
evolutionary precursor. 

However, the homology mystery 
does not stop here. As it turns out, the 
HAR1 region is a part of an overlapping
set of genes called HAR1A and HAR1B
(previously referred to as HAR1F and
HAR1R). Whereas the HAR1 region
itself is highly conserved across taxa, 
even in chickens, the larger gene 
region of which it is only a small part 
is highly nonconserved and is very dif- 
ferent between vertebrate taxa. So how 
could one small isolated segment in this 
region stay relatively the same during 
millions of years of evolution, while the 
surrounding region that it is intimately 
connected with changed so markedly? 
In fact, even in rhesus (a monkey) most 
of the entire HAR1A/B gene region of
approximately 9,000 bases is almost 
completely unalignable to human (Pol- 
lard et al., 2006a). 

The HAR1A and HAR1B genes produce
noncoding RNAs and are expressed
in the developing neocortex (Pollard et
al., 2006b). As it turns out, the 18-base
difference between the human and
chimpanzee versions of the HAR1 gene
leads to remarkably distinctive secondary
structures, as shown in Figure 1 and described
in detail as a result of a thorough
Figure 1. The secondary structures for the human HAR1 RNA (A) and the chimpanzee HAR1 (B). Notice that the difference
in sequence results in a molecule with a significantly different shape. Based on their assumption of universal common
ancestry, evolutionists believe this region of the genome underwent rapid evolution after humans split from chimps. Figure
was adapted from Beniaminov et al., 2008.


biochemical investigation performed 
several years after the original discovery 
of the gene (Beniaminov et al., 2008). 
The chimpanzee HAR1 RNA adopts a
long hairpin structure, while the human
HAR1 RNA forms a completely different
cloverleaf structure. These dramatically
different configurations are clearly associated
with taxonomic specificity and
function.


Other HAR Discoveries 


At about the same time the discovery
of HAR1 was being announced, several
other reports were published describing
larger genome-wide investigations
of accelerated noncoding regions in 
humans and other vertebrates (Pollard 
et al., 2006a; Prabhakar et al., 2006). 
Prabhakar et al. (2006) compared the 
conserved noncoding regions (CNS) in 
humans, chimpanzees, and mice with 
the result that none of the overall pat- 
terns across lineages conformed to the 
grand evolutionary paradigm (inferred 
evolutionary trees). They also found that 
the CNS regions were heavily enriched 
near neuronal cell adhesion genes (cad- 
herins, protocadherins, contactins, and 
neuroligins) in chimpanzees and humans,
but they were not in mice, a clear
anomaly for the overall mammalian 
evolutionary model. Furthermore, dispa- 
rate evolutionary results were obtained 
between humans and chimpanzees, to 
which the authors responded, “This sug- 
gests independent accelerated evolution 
of neuronal cell adhesion functions in 
the human and chimpanzee lineages” 
and “It is unlikely that acceleration of 
neuronal adhesion CNSs in humans 
and chimpanzees resulted in the same 
neuronal phenotypes, because the CNSs 
accelerated in the two lineages are 
largely disjoint and would therefore have 
had different consequences for brain 
development and cognitive function.” In 
the end, they finally identified 992 CNS 
regions that were human-specific and at- 
tributed to advanced neural capabilities 
in humans versus other primates. 

In the study by Pollard et al. (2006a), 
the researchers focused specifically 
on 202 carefully selected candidate 
regions they claimed had been under 
strong negative selection, which is de- 
fined as the removal of alleles that are 
deleterious (also referred to as purifying 
selection). However, it should be noted 
that selection is not actually observed 
in cases like this but merely inferred 
based on the variability of the compared 
sequences in question. These regions are 
essentially nonvariant in humans but are 
significantly different from their counterparts
in chimpanzees. Thus, it is thought
they evolved quickly and then became
indispensable to the human lineage 
and further evolution was shut down 
(constrained) in these regions due to the 
newly acquired functional importance 
of the sequence. This is essentially the 
mindset of the evolutionist in evaluating 
such sequences in a comparative sense. 
The closest genes to these CNS re- 
gions in the Pollard et al. (2006a) study 
were enriched for transcription factors, 
DNA-binding proteins, and regulators 
of nucleic acid metabolism; they were 
shown to be statistically correlated with 
high levels of association to cellular 
processes involved with development, 
neurogenesis, and morphogenesis. 


Functionality of HARs 

To help determine the functionality of 
HAR sequences, a recent study compiled 
a comprehensive list of 2,649 noncod- 
ing HARs, combining data from over 
five studies (Capra et al., 2013). They
then determined functionality for these 
regions using data from the ENCODE 
project for transcription factor bind- 
ing, histone modifications, and other 
indicators of chromatin state. They also 
analyzed positional data to determine 
the genomic landscape in which these 
sequences were situated. Using this com- 
binatorial data, they found that at least 
30% were clearly functional enhancer 
elements, with more than half (~60%) of 
the elements showing enhancer activity 
in at least one type of cellular context. 
Thus, well over half of these types of se- 
quences appear to function as enhancers. 

Enhancers are short 50 to 1500 bp 
regions that bind with transcription fac- 
tors to activate transcription of a gene 
(Capra et al., 2013). They are generally
cis-acting, and can be located up to 1 
million bp away from a gene that they 
regulate, upstream or downstream from 
the gene’s start site and in either the plus 
or minus strand orientation. Over 40,000 
enhancers have been catalogued in the 
human genome, and many are related
to developmental processes (Andersson
et al., 2014). Enhancer HARs have been
found to be enriched in both intergenic 
regions across the genome and intra- 
genic regions inside introns (Capra et 
al., 2013). The HARs in the Capra et 
al. (2013) study were on average 257 bp 
long, and most were within 1 Mb of a 
known gene, with 19% of these genes en- 
coding transcription factor binding sites. 
So clearly these are important regulatory 
sequences in the overall scheme of gene 
and genome regulation. 

Interestingly, the researchers of the 
Capra et al. (2013) study also tested a 
small number of enhancers from both 
human and chimpanzees in transgenic 
mice. While this effort was not exhaus- 
tive in scope, a significant number 
of enhancers from both human and 
chimpanzee drove markedly differ- 
ent expression patterns in developing 
mouse embryos, indicating significant 
differences in functionality. At pres- 
ent, ten different HAR sequences have 
been tested in functional assays such as 
this in a variety of studies. Most were 
implicated in brain development, while 
two enhancer HARs were implicated in 
limb and eye development (Kamm et al.,
2013a; Kamm et al., 2013b; Lindblad- 
Toh et al., 2011; Pollard and Franchini, 
2015; Rossant, 2015; Sumiyama and 
Saitou, 2011). Of course, a limitation 
for studies like this (testing foreign 
constructs in transgenic mice) is that 
they cannot truly recapitulate the true 
function of a human or chimpanzee 
DNA regulatory element; they can only
show how differences in the sequence 
produce different functional outcomes, 
and in some cases, what types of tissues 
their expression may correspond with 
(Pollard and Franchini, 2015). 

The developmental process itself is 
orchestrated through complex regula- 
tory networks that are tightly regulated 
and highly constrained (Davidson and 
Erwin, 2006). All types of DNA se- 
quences, both developmental genes 
(e.g., transcription factors) and regulatory
sequences (like HARs), play major
roles in development. Transcription
factor genes are highly pleiotropic. In 
other words, they participate in multiple 
independent processes, both spatially 
and temporally. In contrast, noncoding 
regulatory sequences, such as enhanc- 
ers, tend to function in a more limited 
number of cell types and processes. They 
also tend to operate more in an additive 
manner, combining together to control
the complex expression patterns of de- 
velopmental genes such as transcription 
factors (Noonan and McCallion, 2010). 
Evolutionists seem to think this 
highly efficient, yet complex system 
of regulatory and developmental gene 
modules is somehow conducive to evolu- 
tion (Carroll, 2008), despite the fact that 
the evolutionary model cannot account 
for their origin and disrupting these se- 
quences often leads to serious problems, 
including catastrophic system failure. 
The most obvious and parsimonious 
explanation is that this type of complex 
modularity in code is analogous to 
human-engineered computer software 
that is both modular and often object ori- 
ented in its construction, where methods 
(functions) can be called in an additive 
fashion to instances of an object, thereby 
controlling and altering its output in 
the overall program. The ingenious 
design patterns in the genome are truly 
spectacular, but the significance of the 
implications is generally missed by
those with the mindset of an evolutionist 
entrenched in naturalistic thinking. 
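
The software analogy can be made concrete with a toy example. The classes below are purely hypothetical and are not a biological model; they only illustrate what it means for methods to be applied additively to an instance of an object so that its output changes.

```python
class Enhancer:
    """Toy regulatory element that is active only in certain cell types."""

    def __init__(self, name, active_in, boost):
        self.name = name
        self.active_in = set(active_in)
        self.boost = boost

    def contribution(self, cell_type):
        return self.boost if cell_type in self.active_in else 0.0


class Gene:
    """Toy gene whose output is a basal level plus the sum of enhancer inputs."""

    def __init__(self, name, basal_level=1.0):
        self.name = name
        self.basal_level = basal_level
        self.enhancers = []

    def add_enhancer(self, enhancer):
        self.enhancers.append(enhancer)
        return self  # returning self lets calls be chained additively

    def expression(self, cell_type):
        return self.basal_level + sum(e.contribution(cell_type) for e in self.enhancers)


if __name__ == "__main__":
    # All names and numbers here are invented for the illustration.
    gene = Gene("example_tf")
    gene.add_enhancer(Enhancer("e1", {"neuron"}, 2.0))
    gene.add_enhancer(Enhancer("e2", {"neuron", "limb"}, 0.5))
    print(gene.expression("neuron"), gene.expression("limb"), gene.expression("liver"))
```

Each call to `add_enhancer` changes the output of the same `Gene` instance only in the cell types where that enhancer is active, which is the additive, context-limited behavior the analogy refers to.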


Deleted Accelerated
Regions in Humans?
Not only is the presence of HARs an
enigma for the evolutionary paradigm, 
but so is the absence of such regions 
when comparing taxa. One must keep 
in mind that within the evolutionary 
mindset, these regions are allegedly 
under strong selective constraint and 
thus differ very little in their sequence 
between taxa. Thus, their sudden “disappearance”
from a genome in the grand
evolutionary tree of life is difficult to 
account for. 

In a large genome-wide survey for 
highly conserved sequences absent in 
human but present in chimpanzee and 
other mammals, researchers found 510 
such sequences, all of which (except 
for one) mapped to noncoding regions 
of the human genome (McLean et al., 
2011). Several of these allegedly deleted 
regions in humans corresponded to ap- 
parent enhancer elements present in 
the genomes of other mammals. The 
conserved chimp and mouse elements, 
along with deletions of them, were tested 
in transgenic mice. It was found that in 
transgenic mouse embryos, one of the 
deletions removed sensory vibrissae 
(tactile hair on the head, e.g., whiskers) 
and a penile spine enhancer element 
from a homolog to the human androgen 
receptor gene. The alleged deletion of 
this element in humans is quite large 
and corresponds to about 60,000 base 
pairs. Another supposed deletion was 
found to correspond to the removal of 
a forebrain subventricular zone (paired 
brain structure situated throughout the 
lateral walls of the lateral ventricles) 
enhancer element in transgenic mice. 

This original study of these two 
specific highly conserved enhancer ele- 
ments (present in other mammals but 
mysteriously missing in humans) was
followed up several years later in another 
study (Reno et al., 2013). Using a combination
of large-scale database sequence
analyses and direct DNA analysis of 
the genomes in question, researchers 
demonstrated that the penile spine/ 
vibrissa enhancer element was missing 
in all human genomes surveyed, and 
also in the archaic human genomes of 
Neandertal and Denisovan, but present 
in DNA samples of chimpanzees and the 
other great apes and other primates that 
exhibit some form of penile spine and 
facial vibrissae. The other 508 conserved
elements supposedly deleted during evolution
in a common ancestor of humans
and chimps remain to be functionally
characterized. 

Another major evolutionary anomaly 
with overall patterns of these conserved 
noncoding elements in regard to their 
alleged mysterious deletion in major 
animal lineages is that the patterns 
are erratic, and the supposed sudden 
absences of these elements are said to rep- 
resent “independent losses” and are “not 
uniform” (Hiller et al., 2012). In other 
words, they do not form consistent evo- 
lutionary trees regarding their presence 
and absence across lineages. Hiller et al. 
(2012) explained the majority of these 
aberrant patterns by claiming that many 
of the lost elements were slightly less 
evolutionarily constrained and shorter 
and thus must have been less pleiotropic. 
Enhancer elements for the most part do 
appear to be less pleiotropic on average 
than protein-coding developmental 
genes (Carroll, 2008; Wray, 2007). But 
this is not really a satisfactory reason for 
their erratic presence or absence across 
major lineages, given their functional 
importance and the alleged evolutionary 
constraint ascribed to them. 


Materials and Methods 
To supplement the literature review in 
this report and to fill in a glaring gap 
within the HAR research community, 
the phylogenetic analysis of 105 differ- 
ent HAR sequences was undertaken for 
the following taxa: human, chimpanzee, 
gorilla, orangutan, macaque, mouse, 
elephant, cow, and chicken. The ap- 
proach to acquiring the data was as fol- 
lows: (1) Using url links at <docpollard. 
com/HARs.html>, each individual HAR 
sequence was followed to its respective 
hg17 “Vertebrate Multiz Alignment & 
Conservation” view at the UCSC ge- 
nome browser (http://genome.ucsc.edu). 
(2) I then went to “View” then “Other 
genomes (Convert)” and used the more 
current hg19 version for my alignment 
data (adjusting the browser view for the 
species listed above). (3) I clicked on the 
alignment link for each respective HAR 
gene and downloaded the subsequent 
alignment view as a plain text file. (4) I 
processed each downloaded text file with 
a Python script I wrote that converted it 
into standard FASTA file format. 

The phylogenetic analysis pipeline 
was performed as follows: (1) The MUS- 
CLE v3.8.31 program (Edgar, 2004) 
was used on the set of 105 FASTA files 
produced as described above, yielding 
multiple DNA alignment output files 
in FASTA format (MUSCLE default pa- 
rameters). (2) The MUSCLE program 
was used again to produce neighbor- 
joining trees (parameters: -maketree, 
-cluster neighborjoining). (3) These 
individual tree files were further text- 
processed and combined into a single 
multitree specialized Newick-style file 
required by the tree comparison pro- 
gram topd_v3.3.pl (Puigbo et al., 2007). 
Steps 1 through 3 were performed via 
a Python pipeline script written by this 
author with MUSCLE being employed 
as system calls within Python. The 
resulting Newick-style multitree file, 
processed with the same Python pipeline 
script, was analyzed with topd_v3.3.pl 
and also evaluated for commonalities in 
topology with basic UNIX shell programs 
such as uniq and grep; the latter was 
employed with a variety of different 
regular expressions for pattern matching. 
Phylogenetic trees, including those for 
this publication, were drawn and printed 
to file using the Phylodendron phyloge- 
netic tree printer program 
(http://iubio.bio.indiana.edu/treeapp/treeprint-form.html). 
The two Python scripts for pro- 
cessing the UCSC “Vertebrate Multiz 
Alignment & Conservation” text files 
and implementing the MUSCLE4tree 
pipeline, along with the FASTA files 
used in this study, have been posted 
at GitHub (https://github.com/jt-icr/har_code.git). 
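
The alignment and tree-building steps described above can be sketched roughly as follows. This is an illustration only, not the posted pipeline script: the function names, file extensions, and directory layout are hypothetical, and it assumes a MUSCLE v3.8 executable named `muscle` is available on the PATH. The layout of the combined file is also simplified to one Newick tree per line; the exact multitree format expected by topd_v3.3.pl is not reproduced here.

```python
import subprocess
from pathlib import Path


def muscle_align_cmd(fasta_in, aligned_out):
    """Command for step 1: align one FASTA file with MUSCLE defaults."""
    return ["muscle", "-in", str(fasta_in), "-out", str(aligned_out)]


def muscle_tree_cmd(aligned_in, tree_out):
    """Command for step 2: neighbor-joining tree from an alignment."""
    return ["muscle", "-maketree", "-in", str(aligned_in),
            "-out", str(tree_out), "-cluster", "neighborjoining"]


def run_pipeline(fasta_dir, out_dir):
    """Align each FASTA file, build its tree, and return the tree paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    trees = []
    for fasta in sorted(Path(fasta_dir).glob("*.fasta")):
        aligned = out_dir / (fasta.stem + ".afa")
        tree = out_dir / (fasta.stem + ".phy")
        # MUSCLE is invoked as an external program (a "system call").
        subprocess.run(muscle_align_cmd(fasta, aligned), check=True)
        subprocess.run(muscle_tree_cmd(aligned, tree), check=True)
        trees.append(tree)
    return trees


def combine_trees(tree_paths, combined_path):
    """Step 3 (simplified): write one Newick tree per line to one file."""
    with open(combined_path, "w") as out:
        for path in tree_paths:
            # Collapse the multi-line Newick output onto a single line.
            newick = "".join(Path(path).read_text().split())
            out.write(newick + "\n")
```

Keeping the command construction separate from the `subprocess.run` calls makes the exact MUSCLE parameters easy to inspect and record alongside the results.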


Results 
The results of the phylogeny analyses 
of the 105 HAR sequences tested were 
inconsistent with the grand evolutionary 
paradigm, in complete accordance with 
all of the other data discussed above. 
Based on analyses with the topd_v3.3.pl 
program, which exhaustively compares 
tree topologies to each other using a 
variety of algorithms, there was no evi- 
dence for a unified evolutionary tree in 
this data set. These trees did not support 
the inferred evolutionary phylogeny for 
the species tested. 

A sampling of the discordant trees is 
shown (incorporating genetic distance) 
in Figure 2. Most notable among these 
trees are those for HAR1 and HAR2, 
two of the best-studied HAR genes 
that are also noted for their evolution- 
arily unsupportive sequences (alleged 
acceleration in humans compared to 
chimpanzees). For HAR1, human and 
mouse cluster together in the same 
branch, as do elephant and chicken. 
The tree for HAR2 likewise is completely 
discordant with evolution, as human 
clusters with elephant. 

As a whole, the different HAR genes 
gave widely different topologies. This 
is frequently observed and is attributed 
to incomplete lineage sorting, a rescu- 
ing device used by evolutionists to 
explain incongruent data. This type of 
evolution-negating pattern has been a 
common finding of studies analyzing 
many different genes, genomic regions, 
or even protein sequences (Degnan and 
Rosenberg, 2009; Hobolth et al., 2011; 
Pisani et al., 2012; Tomkins and Berg- 
man, 2013). It appears the phylogenetic 
discordance for HAR sequences greatly 
exceeds that for other types of regions, 
such as protein-coding gene exons. 

Even when analyzing subtrees within 
the data set, humans and chimpanzees 
clustered together on the same branch 
only on 15 occasions (14% of the trees). 
Gorilla and human clustered together 
in only 8 instances, and orangutans, 
supposedly more distant to humans than 
gorillas, clustered directly with human 
on 11 trees. Furthermore, a two-branch 
cluster with human and at least two apes 
(e.g., “[chimp, human] gorilla,” “[gorilla, 
human] chimp,” etc.) was only seen on 
13 occasions. In other words, if two great 
apes occupied a branch with humans, it 
was typical for the other to be located on 
a completely separate branch. 
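
Counts of this kind can be tallied directly from the Newick strings. The sketch below is a hedged illustration rather than the procedure actually used (the text above mentions uniq and grep with regular expressions): the function names are hypothetical, taxon labels are assumed to be plain words such as `human` and `chimp`, and only direct two-leaf clusters are counted. Which leaves appear as sisters also depends on how each tree is rooted.

```python
import re
from collections import Counter


def sister_pairs(newick):
    """Return the set of two-leaf clusters ("cherries") in a Newick tree."""
    # Drop branch lengths such as ":0.0123" and all whitespace.
    topology = re.sub(r":[0-9.eE+-]+", "", newick)
    topology = "".join(topology.split())
    # A cherry is a parenthesized pair of leaf names with no nested groups.
    pairs = re.findall(r"\(([^(),;]+),([^(),;]+)\)", topology)
    return {frozenset(pair) for pair in pairs}


def count_pairings(trees, taxon="human"):
    """Count how often `taxon` is the direct sister of each other taxon."""
    counts = Counter()
    for newick in trees:
        for pair in sister_pairs(newick):
            if taxon in pair:
                (other,) = pair - {taxon}
                counts[other] += 1
    return counts
```

Calling `count_pairings` on the list of tree strings returns a `Counter` keyed by the taxon paired with human, which can then be compared against the number of trees analyzed.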


Summary 


Human accelerated regions (HARs) 
are noncoding DNA sequences in the 
genome that, according to evolutionary 
reasoning, changed very little over the 
course of animal evolution but mysteri- 
ously and quite suddenly experienced 
a “burst” of change since the alleged 
divergence of humans from chimpanzee. 
These HAR-type sequences also appear 
suddenly in assumed vertebrate lineages 
with no prior evolutionary history, while 
others disappear and then reappear. 

In humans and several other mam- 
mals, many of these HARs are being 
functionally characterized as enhancer 
elements, developmental gene regula- 
tory elements, and even noncoding RNA 
genes. Many of them are also associated 
with a wide variety of important neu- 
rological traits unique to humans. 
Evolutionists claim that the lack of varia- 
tion in these sequences among other 
animals is due to “conserved function.” 
Of course, very little is actually known 
about what these sequences are doing in 
the different kinds of animals in which 
they are found. In reality, we are just 
beginning to discover what they are do- 
ing even in humans. 

In addition to the alleged “accel- 
erated” evolution of these sequences 
within the human genome, this study 
shows that HAR genes show a pattern 
inconsistent with evolutionary predic- 
tions about the common ancestry 
of vertebrate lineages; this pattern is 
typically explained away as incomplete 
lineage sorting. The experimental data 
presented in this report show that these 
alleged highly conserved sequences are 
discordant with classic evolutionary phy- 
logenetic analyses. The analysis of 105 
different HAR genes from 10 different 
vertebrate taxa, including humans and 
the great apes, shows extremely discor- 
dant evolutionary trees. 

Figure 2. A selection of six different neighbor-joining trees from the 105 HAR phylogenies produced using the MUSCLE 
program. Phylodendron was used to draw the trees. Notice how inconsistent the results are, showing that the hypothesis of 
universal common ancestry is not supported by these genetic data. 

So what can we make of all this 
evolutionarily incongruent data sur- 
rounding HARs? Clearly, the most par- 
simonious answer is that they represent 
designed functional mammalian genetic 
elements that encode the novel phe- 
notype of mankind. This is consistent 
with humans being uniquely created 
in the image of God, as clearly stated 
in the Bible. There is no evidence that 
these human-specific sequences evolved 
at any level or that they experienced 
a “burst of changes in humans since 
divergence from chimpanzees.” 

The standard explanation for HARs 
is also clearly falsified by recent research 
that has shown that for any mammalian 
species there is a profound waiting time 
problem associated with establishing 
new traits that require multiple new 
mutations (Sanford et al., 2015). Even 
establishing two codependent mutations 
in a hominin population is extremely 
problematic — requiring tens of millions 
of years. Since HAR genes are differ- 
ent at many nucleotide positions, the 
hypothesis that HAR genes arose very 
suddenly in just a few million years 
due to accelerated evolution is not even 
remotely credible. 

In Psalm 139:14, 16, it is stated: “I 
will praise thee; for I am fearfully and 
wonderfully made ... and in thy book all 
my members were written.” The Hebrew 
word for book is siphrah, which means a 
writing or document and by implication, 
a book, letter, or scroll. The Hebrew 
word for written is katab, which means 
to write, describe, inscribe, prescribe, 
or subscribe. We now know from the 
study of genetics and genomics that the 
genome is, in fact, a highly complex 
multidimensional document written in 
multiple codes and languages that we 
are now only beginning to understand 
(Tomkins, 2015a). Needless to say, this 
kind of handiwork far exceeds the abili- 
ties of even humans to engineer (or even 
fully understand). It clearly points to the 
Creator described in the Bible. 


Adaptive Genetic Changes by Design: 
A Look at the DNA Editing by 
Activation-induced Cytidine Deaminase (AID) 


Jean Lightner* 


Abstract 
According to evolutionary thinking, adaptive genetic changes are 
the result of random (non-purposeful) mutations and natural 
selection. While creationists do not need to account for the assumed 
changes that turn microbes into people, our model certainly points 
to a considerable amount of adaptive change that has occurred 
within created kinds. The naturalistic mechanisms proposed by 
evolutionists appear woefully inadequate to account for these. 
A look at the immune system reveals several different enzymes that are 
used to edit DNA; one of them is activation-induced cytidine deaminase 
(AID). AID is involved in gene conversion, somatic hypermutation, and 
class-switch recombination in B lymphocytes. While each of these begins 
with AID converting a cytosine residue to a uracil residue, the different 
outcomes are a function of different proteins being recruited to process 
the lesion. Since the activity of AID could be disastrous if not kept in its 
proper place, it is well regulated and tightly controlled at many levels. 
The well-designed DNA editing function of AID and other proteins 
in the immune system gives reason to believe that adaptive alleles in 
various populations have similarly arisen by the providence of God, 
the Great Designer, and not by the naturalistic mechanisms proposed 
by evolutionists. 


Introduction 


According to popular evolutionary 
thinking, often referred to as neo- 
Darwinism or the modern evolutionary 
synthesis, the source of variation upon 
which natural selection supposedly acts 
arises from mutation. While the meaning of the term 
“mutation” has changed over the last 
hundred years or so, it now generally 
refers to a change in the DNA sequence, 
primarily caused by unrepaired errors 
during replication (Mayr, 2001, pp. 
96-98, 279-280). It is insisted that there 
is no teleology involved; that is, there 
is no design or purpose underlying 
mutations (Mayr, 2001, pp. 119-120, 
275). They are believed to arise by 
chance and be random with respect 
to the needs of the organism (Huxley, 
2010, p. 54). Natural selection is the 
mechanism given credit for rare adap- 
tive genetic changes becoming fixed 
in various populations of organisms 
(Rensch, 1980, pp. 296-298; Mayr, 
2001, pp. 119-120). 

Creationists approach the scientific 
evidence from a different vantage point. 
We reject the assumption of universal 
common ancestry, and instead recog- 
nize that creatures were created as vari- 
ous kinds (Genesis 1:11-12, 21, 24-25). 
Humans were created separately from 
all other animals (Genesis 1:26-28; 2:7, 
19-24). In the process of analyzing bio- 
logical data within a biblical framework, 
many have questioned two additional 
assumptions of neo-Darwinism: that 
mutations are always random errors 
without purpose, and that natural se- 
lection can really explain the almost 
magical transformations and adaptations 
in populations that evolutionists claim 
(Purdom and Anderson, 2009; Terborg, 
2008; Lightner, 2015). 

There is strong evidence that adap- 
tive genetic changes do occur. The 
biblical history describing the global 
Flood makes it clear that limited genetic 
variability was present immediately after 
that event. This is especially true of 
unclean land animals, where only two 
individuals survived the Flood on the 
ark, but even clean animals and birds 
on the ark had a drastically reduced 
population. Certainly, much more 
genetic and phenotypic variability is 
present today compared to the time 
of the Flood (Lightner, 2006, 2009a; 
Wayne and vonHoldt, 2012). In some 
groups, such as birds, there is astounding 
diversity that has arisen within created 
kinds (Lightner, 2010b). Particularly 
impressive are the adaptive radiations 
of birds inhabiting islands, such as the 
radiations of the vangas of Madagascar, 
the honeycreepers in Hawaii, and the 
finches in the Galápagos (Reddy et al., 
2012; Jonsson et al., 2012; Lerner et al., 
2011; Lamichhaney et al., 2015). 

A recent creationist review of high 
altitude adaptation shows that adapta- 
tion is a complex, multilevel process that 
ranges from short-term physiological 
adjustments in the individual to new 
adaptive alleles in populations (Light- 
ner, 2014). Creationists have proposed 
various mechanisms for genomic change 
beyond the well-known shuffling that oc- 
curs during homologous recombination 
(i.e., crossing over and gene conversion 
in meiosis). Considerable attention has 
been given to transposable elements 
(Wood, 2002; Terborg, 2009a, 2009b; 
Shan, 2009). Many transposable ele- 
ments contain the genetic instructions 
for their own movement, and certain 
conditions seem to activate their move- 
ment (e.g., stress, hybridization). They 
can change the sequence of a gene, alter 
the regulation of one or more genes, 
and/or be involved in chromosomal 
rearrangements (Belyayev, 2014). 

While transposable elements certain- 
ly appear to play a role, many detailed 
studies of genetic differences underlying 
phenotypic diversity have yielded few ex- 
amples of where transposable elements 
appear to be involved in the genetic 
mutations that were identified (Light- 
ner 2008, 2009b, 2010a). Therefore, 
within the creation model, a reasonable 
prediction is that other mechanisms are 
involved in many of these DNA changes. 
It has been pointed out that DNA 
changes in B cells are a normal part of 
mounting an antibody response and that 
all the necessary biological information 
required to induce appropriate variation 
is coded in the genome (Terborg, 2009a). 
Therefore, DNA editing enzymes in- 
volved in the immune system are worth 
considering in more detail, and recent 
reviews highlight advances in our under- 
standing of the underlying mechanisms 
involved in the essential functions they 
perform (Zan and Casali, 2013; Kumar 
et al., 2014; Matthews et al., 2014; Moris 
et al., 2014; Chandra et al., 2015). 


DNA Editing and 
the Immune System 
Two enzymes known to be involved in 
DNA editing within the immune system 
are APOBEC3 (apolipoprotein B mRNA 
editing enzyme, catalytic polypeptide- 
like 3) and activation-induced cytidine 
deaminase (AID). They belong to the 
AID/APOBEC family of DNA and RNA 
editing enzymes, which have key roles 
in a variety of functions (Moris et al., 
2014). This family 
is unique in possessing the ability to 
deaminate cytidine (C; the cytosine 
residue in RNA) or deoxycytidine (dC; 
in DNA) to uridine (U) or deoxyuridine 
(dU) (Figure 1). In many species, APO- 
BEC3 varies in copy number and is 
polymorphic; it restricts the replication 
of many exogenous viruses and endog- 
enous transposable elements (Harris 
and Dudley, 2015). At least some of 
this activity is based on its DNA editing 
ability, which essentially mutates the 
viral genome, destroying its ability to 
replicate. 

While the best-known roles of APO- 
BEC3 involve innate immunity, AID 
plays critical roles in various steps of 
adaptive immunity. Adaptive immu- 
nity involves an ingenious design that 
enables creatures to adapt to the chal- 
lenges in their environment. Organisms 
are constantly exposed to a myriad of 
potentially harmful microorganisms, 
parasites, and toxins. They need to 
be able to identify them and properly 
dispose of them as necessary. Rather 
than being front-loaded with the exact 
code for every antibody to every possible 
antigen that could be encountered, the 
adaptive immune system is strategically 
designed to manufacture highly specific 
antibodies that can be used in several 
different contexts to effectively deal with 
potential pathogens. 

[Figure 1 diagram: (deoxy)cytidine + H2O → (deoxy)uridine + NH3] 

Figure 1. Activation-induced cytidine 
deaminase (AID) deaminates the cy- 
tosine residue deoxycytidine (dC) to 
deoxyuridine (dU) in DNA to enable 
adaptive immune responses. Other 
members of the AID/APOBEC family 
also catalyze this reaction in DNA or 
RNA for other purposes. 

The portion of the genome used for 
antibody (immunoglobulin, Ig) forma- 
tion already contains some variability in 
many species. For example, in humans 
and mice, there are a number of different 
variable (V), joining (J), and diversity (D) 
regions coded on the DNA. Through 
V(D)J recombination, the recombina- 
tion activating gene enzymes (RAG1 
and RAG2) initiate double-stranded 
breaks in the DNA that are repaired to 
bring a single V, D, and J segment in 
apposition with each other (Jung et al., 
2006). Other steps in Ig formation that 
involve DNA sequence modification are 
gene conversion, somatic hypermuta- 
tion (SHM), and class-switch recom- 
bination (CSR). Each of these three 
steps uses AID to initiate the genomic 
changes (Arakawa et al., 2004; Matthews 
et al., 2014). 

Gene conversion in lymphocytes 
was first described in the chicken, which 
has only one V region for both the light 
and heavy chain loci involved in Ig 
formation. However, it was found that 
there were numerous pseudogene V 
regions upstream that provide templates 
for intrachromosomal gene conver- 
sion, copying nucleotide tracts from 
the pseudogenes onto the V region to 
increase Ig diversity. In the rabbit, lym- 
phocytes also undergo gene conversion 
to increase variability; however, some of 
the upstream sequences are potentially 
functional. Interestingly, in both species, 
this process can be used to diversify the 
primary antibody repertoire, or further 
increase diversity in an antigen-specific 
immune response (Lanning and Knight, 
2015). 

Somatic hypermutation (SHM) 
involves the rapid introduction of 
mutations, primarily single nucleotide 
changes, into the complementarity 
determining regions (CDRs) of the re- 
combined V region. The CDRs code 
for the portion of the Ig molecule that 
contacts the antigen. SHM is best known 
for its role in an antigen-specific im- 
mune response, and there is a designed 
mechanism providing for the selection 
of B cells expressing Ig with the greatest 
affinity to the antigen. SHM enables the 
body to rapidly produce highly effective 
Ig to any conceivable antigen that is 
encountered from the more limited di- 
versity of the primary antibody repertoire 
(Matthews et al., 2014). 

Once an effective antibody has been 
produced, there is a need to use it in sev- 
eral different contexts to effectively deal 
with an infection. That is the purpose of 
class switch recombination (CSR); it is 
said to change the effector functions. To 
switch the class of Ig produced, a DNA 
segment needs to be excised to place the 
V(D)J region before an exon coding for 
a different constant (C) region (Figure 
2). There are switch (S) regions contain- 
ing repetitive DNA that precede most 
of the C regions, and these S regions 
are targeted by AID to induce several 
double-stranded breaks so the interven- 
ing region can be removed (Matthews 
et al., 2014). 


AID: Gene Structure 
and Regulation 
In humans, the enzyme AID is encoded 
by the AICDA gene on chromosome 
12. The gene spans 11 kb, consists of 5 
exons, and is primarily expressed in B 
cells. However, in mice it has also been 
detected in oocytes, embryonic germ 
cells, and embryonic stem cells; addi- 
tionally, it has been detected in normal 
human spermatocytes. Occasionally, 
AID has been associated with pathol- 
ogy, as it has sometimes been detected 
in cells affected by chronic inflamma- 
tion or cancer (though generally not 
testicular cancer). Since off-site activity 
of this DNA editing molecule can be 
disastrous, its expression is tightly con- 
trolled on multiple levels (Barreto and 
Magor, 2011). 

In addition to a promoter region 
known to bind four different transcrip- 
tion factors, several other regions are 
involved in controlling transcription of 
the gene. Intron 1 contains an enhancer/ 
silencer region, where two transcription 
factors bind to repress the gene, and two 
different transcription factors bind to 
de-repress it. A third downstream region 
binds a transcription factor to maintain 
physiologic levels of AID expression. A 
fourth enhancer region is located up- 
stream of the promoter. Recently, three 
more enhancers were identified further 
(up to 50 kb) from the aicda locus in 
mice (on chromosome 6; Chandra et 
al., 2015; Kumar et al., 2014). 

Once the gene is transcribed, stabil- 
ity of the mRNA is affected by two dif- 
ferent microRNA (miRNA) molecules. 
These miRNAs bind to the 3’ UTR, 
the untranslated region following the 
portion of mRNA specifying the amino 
acid sequence. One miRNA is down- 
regulated during B cell activation, while 
the other is up-regulated. It is not sur- 
prising, therefore, that experimentally 
induced mutation of the 3’ UTR of the 
AID mRNA resulted in spatiotemporal 
dysregulation of AID and off-site muta- 
tions (Chandra et al., 2015; Kumar et 
al., 2014). 


Enzyme Structure 
The enzyme is 198 amino acids long 
in humans and consists of a number 
of functional domains, some of which 
overlap (pleiotropy) and a few of which 
are not well characterized. Like other 
cytidine deaminases, the catalytic region 
of AID (amino acid positions 56-90) in- 
cludes two cysteines and a histidine that 
coordinate a zinc ion to form the active 
site. Further downstream (113-123), 
though adjacent in the folded protein, is 
a critical hotspot recognition loop. This 
loop specifically targets a weak (W = 
A/T) nucleotide followed by a purine (R 
= A/G) in the -2 and -1 positions relative 
to the dC to be directed into the active 
site (Barreto and Magor, 2011; Nabel 
et al., 2014). 

Figure 2. In class switch recombination (CSR), a segment of DNA must be removed 
to place the V(D)J region next to a different constant region. Activation-induced 
cytidine deaminase (AID) deaminates cytosine residues to uracil residues in the 
two switch (S) regions flanking the segment to be removed. The base excision 
repair enzyme uracil DNA glycosylase (UNG) removes the abnormal base leav- 
ing an abasic site. Apurinic/apyrimidinic endonuclease (APE) nicks the DNA 
at the abasic site. These single-stranded breaks can be converted to staggered 
double-stranded breaks. Mismatch repair (MMR) enzymes and the MRN com- 
plex (Mre11/Rad50/Nbs1) process the staggered ends so the V(D)J region can be 
joined to the constant region. The intervening DNA is circularized and removed. 

Interestingly, AID’s preference for a 
DNA substrate appears to be related to 
the nucleotide’s rotational conforma- 
tion, sometimes known as sugar pucker. 
While AID targets a WRC motif (Figure 
3), APOBEC3G (A3G) favors CCC. 
Experimental grafting of the recogni- 
tion loop from one to the other will 
change the sequence specificity. When 
the AID loop was grafted into an A3G 
background, the chimeric enzyme was 
still efficacious in restricting effective 
HIV infection despite the difference in 
targeted sequence. In contrast, when the 
recognition loop of AID was changed, 
it adversely affected both SHM and 
CSR. Both the CDRs targeted in SHM 
and the S regions targeted in CSR are 
enriched with the WRC motif. This 
is accomplished within the CDRs by 
a preferential use of codons for serine 
(Ser), for example, that result in WRC 
hotspots, while in neighboring regions 
codons are preferentially used that do 
not create hotspots (Nabel et al., 2014; 
Kohli et al., 2010). 
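
The WRC motif described above (W = A/T, R = A/G, followed by the target C) is simple enough to locate with a regular expression. The following is a minimal sketch for illustration; the function names are hypothetical, the sequence is assumed to use the plain A/C/G/T alphabet, and a lookahead is used so that overlapping motifs are all reported.

```python
import re

# W = A or T, R = A or G, followed by the target C (DNA alphabet).
WRC = re.compile(r"(?=([AT][AG]C))")


def wrc_hotspots(seq):
    """Return 0-based positions of the target C in each WRC motif."""
    seq = seq.upper()
    return [m.start() + 2 for m in WRC.finditer(seq)]


def reverse_complement(seq):
    """Reverse complement, so the opposite strand can be scanned too."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]
```

For example, the serine codon AGC matches WRC, whereas the serine codon TCT does not, which is the kind of codon choice the paragraph above refers to. Scanning `reverse_complement(seq)` as well reports hotspots on the opposite strand.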

Among the other domains in the 
AID enzyme are a region required spe- 
cifically for CSR (positions 190-198), 
another required for SHM (13-23), and 
a nuclear export signal (NES, 183-198). 
The latter plays an important role in 
maintaining AID in the cytoplasm, thus 
preventing it from damaging DNA when 
it has not been specifically recruited to 
provide an appropriate function (Barreto 
and Magor, 2011; Zan and Casali, 2013). 


Keeping AID Where It Belongs 
The hydrophilic residues in the NES (C 
terminal) portion of AID are essential for 
its active exclusion from the nucleus. It 
is retained in the cytoplasm, where it 
is complexed with other molecules to 
stabilize it until it is actively imported to 
the nucleus to perform its required func- 
tion. It appears that there is a ubiquitin- 
dependent and independent pathway by 
which nuclear AID can be targeted for 
degradation (Barreto and Magor, 2011; 
Zan and Casali, 2013; Chandra et al., 
2015). Phosphorylation of AID Ser3 is 
one factor that contributes to its degra- 
dation. Additionally, AID abundance 
in the nucleus is positively correlated 
with its catalytic activity (Matthews et 
al., 2014; Le and Maizels, 2015). 
Somatic mutations in B cells are 
initiated during the G1 phase of the 
cell cycle. This is the growth phase 
that occurs prior to DNA replication (S 
phase), which precedes mitosis. When 
AID is experimentally sustained in the 
nucleus during the S phase, cell viability 
is compromised; this explains why AID 
is normally rapidly degraded in the 
nucleus outside the G1 phase. It has 
been suggested that the nick left after the 
cellular machinery removes the foreign 
base, dU, may lead to double-stranded 
breaks if not repaired before replication. 
If so, this may account for the AID-de- 
pendent translocations characteristic of 
B-cell lymphomas where AID expression 
is no longer normally controlled (Le and 
Maizels, 2015). 

Figure 3. Nucleotides are classified as pyrimidines (Y) or purines (R) based on 
their ring structure. A mutation from one Y to another Y, or one R to another R is 
a transition; the ring structure stays the same. Mutations that result in a nucleotide 
with a different ring structure are called transversions. Nucleotides can also be 
classified as weak (W) or strong (S) based on the number of hydrogen bonds (2 
or 3) in pairing. AID prefers the motif WRC (T or A; A or G; C). 

Phosphorylation of AID Ser38 is 
necessary for somatic hypermutation 
(SHM) and class switch recombination 
(CSR), apparently to enable interaction 
with other molecules essential to these 
processes. In SHM, pS38-AID interacts 
with the single-stranded DNA (ssDNA) 
binding replication protein A (RPA), 
which stabilizes the ssDNA substrate 
that is the target of AID activity. In CSR 
pS38-AID recruits RPA and has been 
shown to interact with apurinic/apy- 
rimidinic endonuclease (APE), which 
is required for making the breaks in the 
DNA. Interestingly, double-stranded 
breaks promote AID Ser38 phosphoryla- 
tion, suggesting a positive feedback loop 
amplifies activity in S regions (Kumar et 
al., 2014; Matthews et al., 2014). 

AID recruitment is transcription 
dependent. The transcripts through the 
V region (in SHM) or S region (CSR) 
are not translated, but in at least the lat- 
ter case they are spliced. Deletion of a 
splice donor site was shown to interfere 
with CSR, suggesting that the transcripts 
might perform a regulatory function in 
some cases. During transcription, RNA 
polymerase II (Pol II) is stalled, and a 
factor involved in Pol II elongation and 
stalling, Spt5, has been shown to recruit 
AID. Additional adapter proteins have 
been found to recruit AID through their 
interaction with the abundant AGCT 
repeats (AGC being a subset of WRC) in 
the S region. Several other factors have 
been shown to be involved in recruiting 
AID as well. In fact, it has been com- 
mented that a surprisingly high number 
of cofactors are implicated despite the 
small size of the AID molecule, reflect- 
ing its tight regulation (Zan and Casali, 
2013; Matthews et al., 2014; Chandra 
et al., 2015). 

It is not difficult to understand how 
transcription allows AID access to the 
non-template strand; however, AID 
accesses both strands, which allows 
for the deaminated residues to be con- 
verted into double-stranded breaks for 
CSR. The RNA exosome complex has 
been shown to associate with AID and 
accumulate on S regions in an AID- 
dependent manner. This macromolecu- 
lar complex removes and/or degrades 
nascent RNA on the template strand at 
stalled Pol II sites, exposing ssDNA for 
AID to access (Matthews et al., 2014; 
Chandra et al., 2015). 

Epigenetic factors are also associ- 
ated with AID recruitment. Methylated 
dCs make poor substrates for AID, and 
various histone modifications have 
been identified as playing a role in AID 
recruitment. A recent study evaluating 
patterns in both normal and off-site AID 
targeting found that regions enriched 
with chromatin modification typical of 
active enhancers, such as histone H3 
acetylated at lysine 27 (H3K27Ac), as 
well as modifications typical of active 
transcription, such as trimethyl histone 
H3 lysine 36 (H3K36me3), mediate AID 
recruitment. Several transcription-factor 
binding sites are implicated in recruit- 
ing AID. Most AID targets are grouped 
within super-enhancers and regulatory 
clusters (Zan and Casali, 2013; Mat- 
thews et al., 2014; Chandra et al., 2015). 


Diversity in Outcome via 
Different Repair Mechanisms 
Despite the fact that gene conversion, 
SHM, and CSR in B cells all require 
AID to initiate the process, the outcomes 
are very different. This is the result of 
recruiting a different array of proteins to 
process the dU lesion that AID creates. 

Figure 4. A methyl group (CH3) can be added to deoxycytidine (dC; the cyto- 
sine residue in DNA) as an epigenetic tag to help regulate gene expression. At 
times the methyl group needs to be removed, and AID is sometimes involved. 
AID deaminates 5-methylcytidine (5mC) to thymidine (dT). This creates a T:G 
mismatch, which is subsequently repaired back to C:G, leaving an unmethylated 
cytosine residue. 

In SHM, according to the current 
model, there are three possible pathways 
to repair the AID-induced dU:dG mismatches.
Replication prior to interaction
with other repair enzymes results
in transition mutations (CG > TA). 
Alternatively, removal of dU by the base 
excision repair (BER) enzyme uracil 
DNA glycosylase (UNG) prior to replica- 
tion results in an abasic site. Subsequent 
repair during replication by error-prone 
DNA polymerases can lead to transition 
or transversion mutations. Otherwise, 
dU:dG mismatches can be processed 
by mismatch repair proteins, followed 
by filling in the gap with an error-prone 
polymerase, resulting in mutations at 
neighboring A:T residues and/or short 
indels (Matthews et al., 2014; Kumar 
et al., 2014). 

It is important to recognize that 
error-prone polymerases are an essential 
part of the arsenal of polymerases used 
by cells to maintain genomic stability. 
They are specifically recruited to sites of
DNA damage that the high-processivity,
high-fidelity (i.e., fast and accurate)
polymerases cannot handle. In many
(but not all) cases they accurately repair
the lesions, though in the case of the
immune system, they are recruited to
induce changes (Saugar et al., 2014;
Yang, 2014).

According to the current model for 
CSR, dU introduced by AID is removed 
by the BER enzyme UNG (Figure 2). 
The abasic site is then converted to a 
ssDNA break by APE. A similar nick 
nearby on the opposite strand results in 
a staggered, double-stranded break. It 
has also been found that components 
of the mismatch repair pathway can act 
on dU:dG mismatches to form double-stranded
breaks. These breaks are then
repaired by nonhomologous end join- 
ing (Matthews et al., 2014; Kumar et 
al., 2014). 

Theoretically, there are several ways 
the loose ends can be rejoined during 
CSR. For example, the intervening 
segment between the two S regions con- 
taining double-stranded breaks could be 
inverted, which would not result in a 
functional antibody. However, it appears
there are certain features of the S region
and AID designed to facilitate proper 
joining of the segments. The majority 
of the time the intervening segment is 
circularized, and the V(D)J region is 
correctly attached to the new C region 
(Dong et al., 2015). 


Other Roles of AID 


In addition to its roles in DNA sequence 
diversification in the immunoglobulin 
genes of B cells, AID appears to have 
other important functions. Methylation 
is a common epigenetic tag that helps 
define gene expression patterns essential
to life. AID has been shown to
deaminate 5-methylcytosine (5mC) to
thymidine (dT), though the efficiency of
this reaction is at least an order of magnitude
lower than with its normal substrate
(Figure 4). When this reaction takes
place, it results in a T:G mismatch that
can be processed by glycosylases and
downstream BER enzymes to restore
an unmethylated C. Currently, studies
have reached conflicting conclusions on
the relevance of this reaction in vivo.
One recent summary suggests
that there is no strong evidence for AID
in genome-wide demethylation, but it
appears to play a role in gene-specific
demethylation that underlies cell differentiation
(Ramiro and Barreto, 2015).
AID is also important in B-cell toler- 
ance, and lack of the enzyme is associ- 
ated with autoimmune disease. This is a 
rather paradoxical phenomenon, where 
humans lacking AID not only suffer 
from infections because they cannot 
mount a normal antibody response but 
also suffer autoimmune disease due to 
the inability to remove autoreactive B 
cells. In this role AID is expressed in 
immature B cells along with RAG2, 
though many details of how they elimi- 
nate autoreactive B clones remain to be 
elucidated (Cantaert et al., 2015). 
Obviously, although AID has numer- 
ous crucial functions, loss of control 
over AID or the associated DNA repair 
pathways can have disastrous results.
Hypomethylation, point mutations, 
indels, and structural rearrangements 
are all features that are associated with 
cancer. Off-site activity of AID appears 
to be one factor that can contribute to 
carcinogenesis in certain malignancies 
such as lymphoma (Dominguez and
Shaknovich, 2014; Pettersen et al., 2014).

Interestingly, in addition to muta- 
tions driving oncogenesis, cancer cells 
carry many thousands of passenger 
mutations not directly related to disease 
progression. With the increased avail- 
ability of rapid-sequencing technologies, 
scientists have examined the patterns of 
mutations in a variety of cancer types 
to understand the factors involved. 
Different processes leave a different 
“mutational signature” depending on 
the exogenous or endogenous DNA 
damaging agents, as well as the repair or 
replicative pathways that follow (Helleday
et al., 2014). Perhaps there are other
places where endogenous enzymes are 
playing an important functional role as 
they alter the DNA sequence. If so, these 
signatures identified in cancer studies 
could help identify those places. It may 
be that the presence of AID in oocytes, 
spermatocytes, and embryonic cells 
is related to the induction of adaptive 
germ-line mutations. 


AID: A Role in Adaptation? 


Detection of AID in primordial germ 
cells, embryonic stem cells, and sev- 
eral other cell types was the impetus for 
investigating a possible role of AID in 
demethylation (Matthews et al., 2014). 
While AID does appear to sometimes 
play a role in demethylation, it could 
play another role in these cells. It has 
been suggested that it plays a role in 
meiotic recombination. SPO11 is an
important enzyme that initiates double-stranded
breaks during meiotic recombination.
In some assays, AID appears
to partially rescue SPO11 deficiency
(Barreto and Magor, 2011). However,
no statistical difference was noted in the
average recombination events between 
normal and AID null mice (Cortesao 
et al., 2013). A third possibility that has 
been suggested is that AID may still play 
an APOBEC3-like role in controlling 
transposable element movement in 
some species (Barreto and Magor, 2011). 

Another possibility exists: AID may 
be purposefully recruited to germ cells 
for DNA editing. In other words, en- 
zymes such as AID and/or mutagenic 
repair pathways may be involved in the 
purposeful formation of adaptive alleles. 
It has already been noted that homolo- 
gous recombination (crossing over and 
gene conversion) is mutagenic and that 
this is associated with adaptive mutations 
in bacteria. Error-prone TLS polymerases
and/or error-prone repair pathways
have been shown to play a role (Malkova 
and Haber, 2012). Given the purposeful 
nature of mutations induced by these 
mechanisms in adaptive immunity, it is 
quite plausible that genetic adaptation 
has a similar underlying basis. 

If many germ-line mutations are 
purposeful, then it is expected that vari- 
ous factors will eventually be identified 
that govern the targeting of sites for 
mutagenesis and recruiting of appropri- 
ate proteins. Physiologic adaptation is 
characterized by changes in gene expres- 
sion, which is mediated by epigenetic 
changes. Adaptive alleles often arise 
in the same genes as those targeted in 
physiologic adaptation (Lightner, 2014). 
Since transcription and associated epi- 
genetic changes are important in recruit- 
ing AID, it may be that these factors play 
a role in targeting various regions of the 
genome for adaptive genetic changes. 

Further, there may be purposeful 
mechanisms to increase the frequency 
of adaptive alleles in the population. 
Meiotic drive refers to any process that 
distorts Mendelian inheritance by pref- 
erentially transmitting one haplotype 
(or allele) over another when gametes 
are formed by meiosis. Biased gene 
conversion is one example; it can result
from the break being preferentially induced
on one strand of DNA over the
other. Other downstream factors, such 
as the factors recruited to repair the 
break, can be involved as well. In some 
cases, this biased transmission is associ- 
ated with, and perhaps influenced by, 
single nucleotide polymorphisms (SNPs; 
Odenthal-Hesse et al., 2014). 
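
The effect of biased transmission on allele frequency can be illustrated with a minimal Python sketch. It uses a standard single-locus textbook recursion, not a model taken from the studies cited here, and assumes random mating and no selection. The names `next_frequency`, `generations_to_reach`, and the transmission ratio `k` are hypothetical labels chosen for this illustration.

```python
def next_frequency(p, k=0.5):
    """Frequency of allele D after one generation of random mating.

    p is the current frequency of D; k is the assumed probability that a
    D/d heterozygote transmits D (k = 0.5 is Mendelian segregation).
    No selection is applied in this sketch.
    """
    q = 1.0 - p
    # D/D homozygotes always transmit D; heterozygotes transmit D with chance k.
    return p * p + 2.0 * p * q * k


def generations_to_reach(p0, k, target, limit=10_000):
    """Count generations until D reaches the target frequency (or the limit)."""
    p, n = p0, 0
    while p < target and n < limit:
        p = next_frequency(p, k)
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical numbers: a rare allele with a modest transmission bias.
    print(generations_to_reach(0.01, 0.6, 0.99))
```

With `k = 0.5` the frequency does not change from one generation to the next, whereas any `k` above 0.5 raises the frequency of the favored allele each generation even though no fitness difference is modeled.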

The existence of meiotic drive has 
significant implications for the evolu- 
tionary assumption that natural selection 
is a major player in adaptation. It has 
long been known, based on mathematical
models, that natural selection cannot
account for diversity in vertebrates, even 
in an evolutionary time frame (Hal- 
dane, 1922; Kimura, 1968). Further, 
mathematical modeling suggests that 
genetic drift eliminates the majority 
of rare beneficial alleles. Finally, the 
phenotypically based prospective study 
of natural selection in Galapagos finches 
showed natural selection acted only at 
discrete times of harsh environmental 
conditions and was not consistent in 
direction (Lightner, 2015). 

Despite this, there have been many 
genetic studies that have relied on 
statistical tests that suggest that natural 
selection has occurred, even within 
groups that creationists would say belong 
to the same kind. Yet these tests usually 
assume Mendelian inheritance, and 
the existence of meiotic drive violates 
those assumptions. Thus, meiotic drive 
is likely to be a well-regulated, designed 
mechanism (evolutionists currently as- 
sume it is random) that accounts for the 
statistical patterns normally attributed to 
natural selection (Lightner, 2015). 


Summary 


Historically, evolutionists have insisted 
that adaptation takes place by the natu- 
ralistic mechanisms of random genetic 
mutation and natural selection. These 
are philosophical assumptions based on 
a worldview that rejects a Designer, not 
something that was demonstrated scientifically.
In the creation model, universal
common ancestry is rejected, but there 
is diversification and speciation that has 
obviously occurred within many created 
kinds. The appearance of adaptive al- 
leles in various populations around the 
world suggests that there are designed 
mechanisms by which these alleles arise. 
A look at the adaptive immune re- 
sponse indicates that the body has the 
ability to edit DNA in a variety of ways to 
meet environmental challenges. Several 
enzymes, including AID, are designed 
to initiate alterations in the DNA se- 
quence. Then a variety of outcomes are 
possible depending on which proteins 
are recruited to resolve the aberrant 
base or DNA breaks that were induced. 
The whole process is well designed and 
tightly regulated. This bolsters the idea 
that similar designed mechanisms are in- 
volved in adaptive germ-line mutations. 
Additionally, AID has been detected 
in ovaries, spermatocytes, and embry- 
onic stem cells. While AID does appear 
to play a limited role in demethylation, 
which may at least partially explain its 
presence in these locations, this also 
leaves open the possibility that AID may 
play other roles. There is evidence it can 
partially cover for the loss of the enzyme 
SPO11, which is normally involved in
meiotic recombination. AID, or other 
DNA editing enzymes, may be actively 
recruited during meiosis, which could 
help explain why homologous recombi- 
nation is mutagenic beyond the normal 
crossing over and gene conversion. 
Further, it is recognized that genes 
that are involved in physiologic adapta- 
tion are often the same genes involved in 
adaptive genetic mutations. Physiologic 
adaptation involves epigenetic changes 
that up- or down-regulate genes to com- 
pensate for an environmental challenge. 
AID can target regions where there is 
active transcription and other epigenetic 
signals, which suggests physiologic ad- 
aptation may plausibly be an important 
prerequisite if AID in fact does play a
role in adaptive germ-line mutations.
Since such mutations leave “signatures,”
it may be possible to bioinformatically
screen for regions of the genome where
adaptive mutations were induced by
AID or a similar DNA editing enzyme.
Finally, when new adaptive alleles 
appear in a population, there needs to 
be an effective means for them to spread. 
Natural selection could play some role, 
but there are multiple lines of evidence 
suggesting that it is not particularly effective.
The reality that meiotic drive,
a type of non-Mendelian inheritance, 
exists suggests that it may play an im- 
portant role in increasing the prevalence 
of adaptive alleles within a population. 
All these considerations point to 
potentially fruitful lines of research. 
While the intelligent-design framework 
does recognize design is present in the 
genome, the history in Genesis provides 
background information suggesting sev- 
eral important places to look for this de- 
sign. If AID and/or similar enzymes play 
a role in adaptive germ-line mutations, 
it would be one more line of evidence
that adaptation occurs because of the
Creator, the God described in the Bible,
who cares for His creatures.
Cells as Information Processors


Part I: Formal Software Principles 


Royal Truman* 


Abstract
Cells perform millions of Boolean logic operations every second
using multiple independent codes with stringent formal rules
instantiated on DNA, RNA, proteins, sugars, and membranes. These
codes rely on elementary and concatenated symbols to define variables
and values that can be written, deleted, and read from long- and short-term
memory. Computer and cellular variables are used with control
structures such as “GoTo,” subroutine calls, “wait,” and to initiate
and terminate iteration loops. They have well-defined data types and
allowed operations. Values can be structured in arrays and linked lists.
Although variables are identifiable in cells, logic is executed without
a readable source code, using hardwired biochemical components
and inherited molecular machines (MMs). Each code requires unique
decoding MMs, and cellular codes interoperate to incorporate details
located throughout the cell to permit holistic correct decisions. Tight
integration between these codes is implemented using adaptor biomolecules.
DNA, RNA, and proteins are used to define both variables
and values for independent codes, often in overlapping regions. These
biomolecules are also needed to create MMs, adaptors, and the rest of
the infrastructure.

Biological research and interpretation
have been dominated by philosophical
naturalism for almost two centuries, especially
when considering the question
of origins. Deliberate design is often
rejected as unscientific, which leads to
even absurd proposals being entertained
since “something must have happened.”
This is remarkable, since we interact
daily with a world affected by conscious
decision making. If we found a computerlike
object on Mars, most would
not insist on finding an explanation
limited to deep time, random mutations,
natural selection, chemistry, and physics.
Although it would be possible to also 
explain the actions of a chess-playing 
program post-facto by tracing a series of 
internal mechanistic steps, this explana- 
tion would be incomplete. It would fail 
to explain the innate ability to anticipate 
and solve novel complex problems. 
Prokaryote and eukaryote cells 
can do far more than a chess-playing 
program, being able to solve an astonishing
variety of unrelated problems
concurrently. A seemingly endless 
list of contingencies has been antici- 
pated, even when the exact details were 
never encountered before by the cell 
or its ancestors. Flexible categories of 
problems have been foreseen. Cells 
perform logic processing in a manner 
surprisingly similar to computers, using 
codes, structured datatypes, variables, 
algorithmic constructs such as Boolean 
logic and iteration, and a hierarchy of 
sophisticated data storage strategies for 
short- and long-term memory. Ignor- 
ing this integrated, holistic aspect of 
cells and insisting on a reductionist 
neo-Darwinian explanation for every 
cellular feature prevents answering the 
relevant questions correctly: Where did 
they come from, and why are they there? 


Interpreting Biological 
Change and Development 
Many complex processes exhibited by 
living systems suggest an intention or 
purpose. Examples include migration of 
birds to specific locations during certain 
time periods, development of adults 
from a fertilized cell, metamorphosis of 
caterpillars into butterflies, and execu- 
tion of a strategy based on mental pro- 
cesses. This led philosophers long ago to 
embed purpose in physical objects as a 
form of internal will. Aristotle identified 
four kinds of causes for movement and 
change in general—the material, for- 
mal, efficient, and final—and claimed 
in Book II of Physics that a stone falls 
because it has an internal nature that 
drives it to attain its natural state. Many 
prominent thinkers since then have tried 
to interpret the specialness of living systems
using notions such as a “formative
drive,” “living principle,” “life-energy,”
“entelechy,” and “teleology.”

Currently, however, science has 
become dominated by reductionist and 
mechanistic thinking typified by books 
such as Jacques Loeb’s The Mechanistic 
Conception of Life published in 1912
and the works of behaviorist psychologists
(in particular B. F. Skinner) who
deny the existence of will and mental 
states that perceive and direct behavior. 
This misguided naturalist thinking 
distorts much of what we observe and 
experience. Purpose and guidance are 
apparent and need to be taken into 
account. The existence and operation 
of an orchestra, growth of trees, poker- 
playing programs, and so on cannot be 
adequately explained by extrapolation 
from the natural behavior of many 
atoms. Wilhelm Dilthey (1833-1911), 
prominent philosophy professor at the 
University of Berlin, had a special interest
in scientific methodology and
introduced a distinction between the 
humanities (Geisteswissenschaften) and 
natural sciences (Naturwissenschaften). 
He argued correctly that investigative 
methods are often being applied in areas 
they are unsuitable for. 

Purpose and guidance in nature 
need to be revisited. In this two-part 
series, we will examine how intent is 
governed in cellular processes, using 
digital computers as a model. We will 
show formal software principles are in- 
volved, which are processed by hardware 
molecular machines (Scruton, 1996, p. 
254). University of Chicago microbiol- 
ogy professor James Shapiro referred to 
such stored instructions in a recent lec- 
ture, pointing out, “Cells use cognitive 
processes (=action based on knowledge) 
in dealing with genomic information” 
(Shapiro, 2011). At the conclusion of 
this analysis, we are reminded of Aris- 
totle’s claim that we cannot understand 
any cause for change until we can 
deduce its purpose (Stangroom and
Garvey, 2005, p. 17).


Examples of Complex 
Programs in Cells 
Prokaryote and eukaryote cells contain 
hundreds of integrated and carefully 
regulated programs such as metabolic 
networks and signal cascades linking
the environment with gene regulation.
Complex multicellular organisms dis- 
play gene regulatory networks to unfold 
developmental programs and generate 
nervous systems and brain microcircuitries
(Markram et al., 2015). We will
examine these and other examples be- 
low and in the next paper. In all cases 
well-defined, logic-processing steps are 
involved, which channel the outcomes. 


Coded Information Systems
In a series of papers, Truman introduced
the theory of coded information systems 
(CISs), a framework to interpret how 
information-driven systems work. A CIS 
consists of linked tools or machines that 
refine outcomes to attain a specific goal 
(Truman, 2012a, 2012b, 2012c, 2013, 
2015) (Figure 1). A coded message must 
play a prominent role between at least 
two members of these linked processes 
to demarcate from simple machines. 
Messages satisfy rules and strict formalisms
so that they can be interpreted reliably,
and they provide flexibility and serve
multiple purposes (Hofstadter, 1980, p. 26).

Intended outcomes are ensured 
in a CIS through refinements using a 
combination of four possible “refine- 
ment factors”: coded messages, sensors, 
physical hardware, and preexisting re- 
sources such as data or logic-processing 
algorithms. The model is quantitative, 
measuring the decreased entropy with 
respect to a reference state between each 
refinement step. 

Often the CIS first increases the 
range of possible outcomes before 
applying constraining processes. To 
illustrate, the coding portion of a par- 
ticular gene specifies a subset of useful 
protein sequences. How has entropy 
been decreased? The reference entropy 
to compare against is the variety of poly- 
peptides that could be generated thanks 
to the cellular environment (without 
DNA, RNA polymerase, ribosome, ATP, 
tRNAs, and other resources, these long 
linear chains do not form naturally). The
reduction in the entropy of the reference
sequences versus the sequences coded
by a gene for a specific purpose defines
the information gain.

Figure 1. Coded Information Systems sequentially refine behavior through a series
of processes. Each goal-directing refinement step could be influenced through
coded messages, sensors, physical hardware, or preexisting resources such as data
or logic-processing algorithms. At least one process must be guided by coded
instructions to be a CIS.
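
The entropy comparison described above can be made concrete with a short Python sketch. It assumes, purely for illustration, that every sequence in the reference set and in the refined set is equally likely, so that each entropy reduces to a base-2 logarithm of a count. The function names and the example counts are hypothetical and are not taken from the CIS papers.

```python
import math


def shannon_entropy(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def information_gain(n_reference, n_refined):
    """Entropy decrease in bits when n_reference equally likely outcomes
    are narrowed to n_refined equally likely outcomes (an assumption of
    this sketch, not a claim about real protein sequence statistics)."""
    if not 0 < n_refined <= n_reference:
        raise ValueError("need 0 < n_refined <= n_reference")
    return math.log2(n_reference) - math.log2(n_refined)


if __name__ == "__main__":
    # Hypothetical numbers: all 20**10 decapeptides as the reference state,
    # narrowed to an assumed 1,000 sequences that serve the purpose.
    print(round(information_gain(20 ** 10, 1_000), 1), "bits")
```

When the outcomes are not equally likely, `shannon_entropy` can be applied to the reference and refined distributions separately and the two results subtracted.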

CIS are often embedded hierarchi- 
cally. The F,, region of ATP synthase 
is a component of the ATP synthase 
molecular machine, which is embedded 
in a mitochondrion, which is part of a 
cell, which is part of an organ, which 
itself is part of an integrated organism 
that contributes to an ecological CIS. 
Coded messages communicate inten- 
tion between members of the system. In 
eukaryotes, many subsystems comprise 
an individual organism, whereas in 
prokaryotes there is more distribution 
of effort between collaborating species 
in an ecology with exchange of signals 
and genetic materials via passive uptake 
of DNA (Claverys et al., 2006), conjugal 
transfer, viral transduction, and other 
lateral gene transfer mechanisms (Stan- 
ton, 2007). 


Indications Cells 
Could Be Computerlike 


Modern computer architectures (Von 
Neumann architecture, n.d.) remind us 
of cells. DNA provides long-term storage, 
and the data are not randomly thrown 
together but sensibly structured, even as 
computers use file systems to organize 
related data. Genes in prokaryotes that 
need to be co-expressed are often located 
together and controlled by an operon 
(Osbourn and Field, 2009). In a recent 
study, for every eukaryote analyzed, gene 
order was not statistically random, but 
often those having similar and/or coor- 
dinated expression are clustered (Hurst 
et al., 2004; Michalak, 2008; Chu et al., 
2011). Just as data on computer hard 
disks are stored in sectors, Alu-sequence 
containing nucleosomes define regions 
of the DNA (Salih et al., 2008; Trifonov, 
2011). 

DNA is a read/write/delete system. 
Data can be reorganized by transposons
and content added via CRISPR (Clustered
Regularly Interspaced Short Palindromic
Repeats) (Zetsche et al., 2015;
Ran et al., 2015), lateral gene transfer, 
and transfer of plasmids in prokaryotes. 
Genomes can also be contracted by 
deletions, such as the removal of trans- 
posable elements (van de Lagemaat et 
al., 2005). Portions of DNA are read 
many times and converted to mRNA 
copies—short-term memory—where 
logic processing is performed. Further- 
more, mRNA codons specify amino acid 
sequences, so clearly a code exists. 
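
The codon-to-amino-acid mapping mentioned above behaves like a lookup table, which the following Python sketch illustrates. Only a small excerpt of the standard genetic code is included, and the function name `translate` and the sample sequence are hypothetical choices for this example.

```python
# Excerpt of the standard genetic code (mRNA codon -> amino acid).
# None marks the three stop codons. The table is deliberately incomplete.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "UUC": "Phe",
    "GGU": "Gly", "GGC": "Gly", "GCU": "Ala",
    "AAA": "Lys", "UGG": "Trp",
    "UAA": None, "UAG": None, "UGA": None,
}


def translate(mrna):
    """Read an mRNA string three bases at a time until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise KeyError(f"codon {codon} is not in this excerpt of the table")
        residue = CODON_TABLE[codon]
        if residue is None:  # stop codon: translation ends here
            break
        peptide.append(residue)
    return peptide


if __name__ == "__main__":
    print(translate("AUGUUUGGCUAA"))
```

The same symbols read in a different frame, or through a different table, would yield a different product, which is what makes the mapping a code rather than a direct chemical necessity.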

We will focus here in Part 1 on
formal software features like data types, 
data structures, codes, and algorithms, 
which are useful to solve problems 
using abstract methods, independent 
of the hardware implementation. The 
hardware aspects used by cells will be 
examined in Part 2. 


Key Principles to 
Understand How Cells Work 
Before showing that cells use formal soft- 
ware constructs, we need to devote some 
effort to eliminate a few misunderstand- 
ings and introduce some guiding in- 
sights: DNA does not provide an explicit 
prescriptive source program readable 
by humans; multiple codes are in use; 
each code requires a distinct alphabet 
and hardware decoder; software and 
hardware are far more integrated than in 
digital computers; and logic processing
is distributed and hierarchical.


DNA Does Not Provide 
an Explicit Prescriptive 
Source Program 

Many still erroneously believe DNA 
contains a prescriptive language containing
a complete blueprint or “Book of
Life” that specifies in detail the development
of an organism. As Woodward and
Gills wrote recently, “This is the shock 
of shocks: that the DNA alone does not 
play the part of the director” (2012, p. 
75). This contrasts with computer programs,
whose logic can be understood
from the source code. Consider as an
example (1):

    if (A=5 and B='red' and
        not C='Deactivate')
    then { 'execute following instructions' }    (1)


A line of readable coding such as (1) 
will not be found in DNA or elsewhere 
in a cell, but the variables can be iden- 
tified, and logic operations are indeed 
being performed. Can we discern the 
Boolean logic and resulting process- 
ing being performed? Yes, empirically. 
Consider as an example of the variables 
A, B, and C three cis-regulatory elements
(CREs, specific nucleotide patterns on 
DNA). Each value is defined by which 
transcription factor (TF, a protein) is 
attached or “nothing is attached.” The 
logic being performed can be deci- 
phered by systematically varying the 
values (Davidson, 2006) and simulated 
with computer programs. 
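
A minimal Python sketch of such a simulation follows. It enumerates every combination of values for three CREs and records the outcome, which is the computational analogue of systematically varying the values in the laboratory. The rule in `gene_is_expressed`, the labels "TF1", "TF2", and "repressor", and the use of `None` for "nothing is attached" are all hypothetical; they simply mirror the form of example (1).

```python
from itertools import product


def gene_is_expressed(a, b, c):
    """Hypothetical rule in the form of example (1):
    CRE A and CRE B must carry their activators, and CRE C must not
    carry the repressor."""
    return a == "TF1" and b == "TF2" and c != "repressor"


def truth_table(rule, domains):
    """Evaluate the rule for every combination of CRE values."""
    return {values: rule(*values) for values in product(*domains)}


# Each CRE either carries one transcription factor or nothing (None).
DOMAINS = [("TF1", None), ("TF2", None), ("repressor", None)]
TABLE = truth_table(gene_is_expressed, DOMAINS)

if __name__ == "__main__":
    for values, expressed in TABLE.items():
        print(values, "->", expressed)
```

Going in the other direction, from an observed table of outcomes back to the rule that produced it, is the empirical deciphering task the paragraph describes.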

The logic is implicit but very real,
and built into the system as a whole, and
for good reasons. Cells have far greater
functionality than computers. They can
replicate autonomously, generate their 
own energy, repair themselves, manufac- 
ture and recycle the substances needed, 
produce their processing hardware, 
and interact dynamically to provide 
emergent behaviors, even committing 
suicide (apoptosis) when necessary 
for the common good. An inheritable, 
error-free source code program to cover 
all these details and eventualities would 
not be feasible. Instead, cells replicate 
only the variables and their values, plus 
a functional copy of the necessary hard- 
ware each generation. 

This strategy provides less opportu- 
nity for information corruption com- 
pared to specifying all the steps in precise 
detail in order to assemble thousands 
of cellular components, test the timing 
of location and progress of activity, and 
then mandate corrective action to take. 
We complete the explanation in Part
2 by showing how judicious organization,
and inheritance, of the hardware
components provide informative contributions
and thereby reduce what the
software needs to communicate.
Francis Crick was wrong when he 
claimed the genome was the (sole) 
source of phenotypic information 
(Crick, 1970). We can show this in 
many ways. A consequence of RNA 
editing, trans-splicing, and other post- 
transcriptional RNA modifications 
is that the modified sequences can 
undergo reverse-transcription and be 
introduced into the DNA germ line 
(Moller-Krull et al., 2008). Furthermore, 
changes in chromatin (which do not 
alter DNA sequences) can be inherited 
later over multiple generations (Jaenisch 
and Bird, 2003). In fact, somatically 
heritable chromatin structures are one 
way to establish differentiated cell lines 
(Gurdon et al., 2003). Further evidence 
that DNA does not directly prescribe 
final outcomes includes the existence of 
multiple life stages such as invertebrates 
having distinct larval and adult stages 
and other examples of metamorphosis. 
In the next paper, which accompanies 
this one, we describe the cell as an in- 
teracting set of controlling subsystems, 
each with its own coded variables, and 
less as a hierarchical or cascading design. 
Much of what is necessary in the cell 
is not directly guided by DNA (Barbieri, 
2003, p. 31). Globular proteins work 
only after they fold properly, which is 
affected by factors such as fluidity of the 
environment, how fast different sections 
are translated in a ribosome (Spencer 
et al., 2012), and the contribution of 
chaperones. Even after proteins form, 
additional guidance is provided, not 
by DNA, but by ligands, which are ju- 
diciously attached and removed. Gene 
regulatory networks develop automati- 
cally upon activating/deactivating CREs 
that are passively poised in anticipation. 
If one or more TFs activate a particular
CRE, the resulting protein (a new TF)
can activate or deactivate the same or
different CRE(s), eventually leading
automatically to mutually interacting 
circuits with no a priori guidance from 
explicit coded instructions. 
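
How interacting circuits can arise from nothing more than local activation and repression rules can be sketched with a toy Boolean network in Python. The three genes X, Y, and Z and their wiring (X activates Y, Y activates Z, Z represses X) are hypothetical, and the synchronous update scheme is a simplifying assumption of this sketch.

```python
def step(state):
    """Advance the toy network by one synchronous update.

    state is a tuple (x, y, z) of booleans: True means the gene's TF is
    currently present.
    """
    x, y, z = state
    return (
        not z,  # CRE of X is active unless repressor Z is bound
        x,      # CRE of Y is activated by X
        y,      # CRE of Z is activated by Y
    )


def trajectory(state, max_steps=20):
    """List the states visited until one repeats (or max_steps is reached)."""
    seen = []
    while state not in seen and len(seen) < max_steps:
        seen.append(state)
        state = step(state)
    return seen


if __name__ == "__main__":
    for s in trajectory((True, False, False)):
        print(s)
```

No line of the sketch states "oscillate," yet the wiring alone produces a repeating cycle of states, which parallels the point that circuit behavior follows from the poised components without an explicit instruction for the overall outcome.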

RNAs can also behave as informa- 
tive riboswitches. A small molecule 
binds to part of the RNA (the aptamer), 
which causes an allosteric change in 
another portion of the RNA called the 
expression platform, which can regulate 
gene expression (Serganov and Patel, 
2007). There are many more examples 
of information processing that do not 
involve exclusive and direct guidance 
by DNA, such as aggregation of surface 
receptors in response to ligands (Wulfing 
et al., 2002; Bray and Duke, 2004; Murai
and Pasquale, 2004) and cytoskeletal 
reorganization (Pollard and Borisy, 2003; 
Pelkmans, 2005). 

There are cases, of course, where
outcomes are partially specified directly
by DNA, such as the N-end rule, whereby
the half-life of a protein is determined
to a large extent by the identity of its
N-terminal residue. Sometimes DNA
provides parameters less obviously such 
as in protein and vesicle targeting to 
distinct cellular locations (Bonifacino 
and Glick, 2004; Pool, 2005) and pro- 
tein export from cells (Neel et al., 2005; 
Stuart and Ezekowitz, 2005). Here the 
signal sequences are extremely variable, 
both in length and amino acid composi- 
tion, and the parameters are generated 
sometimes by remote parts of proteins 
brought together only after folding. This 
variability could be necessary for various 
processing details including additional 
post-targeting functions (Hegde and 
Bernstein, 2006; Emanuelsson, 2002; 
http://psort.hgc.jp/). 

Evolutionists have generally argued 
that mutations are all that is needed to 
explain current cells. Distinguished 
Oxford professor Denis Noble, a force- 
ful critic of Dawkins’ reductionist views, 
pointed out that this is too simplistic: 
“Neo-Darwinism also privileges ‘genes’ 
in causation, whereas in multi-way 
networks of interactions there can be 
no privileged cause” (Noble, 2015, p. 1).

Figure 2. Logic processing can occur by the sender before communicating coded data and after the receiver knows what
should be done. Sometimes little or no reasoning is needed to generate the sender’s data, such as photons landing on retinas,
and thereafter complex logic must be executed by the receiver to extract benefit from the data and decide what is to be done.

Does DNA determine outcomes by 
already possessing the necessary instruc- 
tions, or does it respond to signals from 
the cell (e.g., to replace proteins that
the cell has decided are needed)? We agree with
Noble, who also wrote, “The causality 
is circular, acting both ways: passive 
causality by DNA sequences acting as 
otherwise inert templates, and active 
causality by the functional networks of 
interactions that determine how the ge- 
nome is activated” (2015, p. 9) and that 
“IF-THEN-ELSE” type instructions are 
found in cells (p. 10). 

An interesting consideration is where 
most of the decision making occurs in 
computers and cells (Figure 2). This 
issue arises in all sender-receiver forms 
of communication. In some cases, a 
message could provide very detailed 
instructions, and in other cases the mes- 
sage is (explicitly) minimally informative. 
When only variable values are commu- 
nicated, sometimes the sender performs 
considerable logical preprocessing and 
then only provides what is relevant 
(which the receiver can easily process). 
In other cases, raw data are made avail- 
able, and the receiver is responsible to 
make sense out of it. 

In the first example we will consider,
the sender has performed most of the
important logic processing before sending
the following coded data (2):


(Co='IBM'; Nr_stocks_to_buy=510; When_to_buy='16 o’clock CET')    (2)


The receiver now knows what to do 
(which stocks to buy, how many, and 
when). Considerable decision making 
occurs in cells in the sender environ- 
ment before the concentration and 
location of TFs are specified, and the 
results are communicated and processed 
as variable values by the relevant CRE's 
variables at the receiving side. 

In the next example, the receiver
must perform much deductive processing,
since variable values are communicated
whose significance needs to be
interpreted and evaluated (3):

(Co='IBM'; Stock_change_in_price=0.1; Weather='cloudy'; Winner='Manchester United')    (3)


The receiver must now determine 
what is relevant and how it correlates 
with the decisions to be made. Human 
minds typically process raw data consid- 
erably before making a decision. 
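
The contrast between messages (2) and (3) can be sketched in Java. This is only an illustration: the class and method names are hypothetical, and the rule "buy when the price change is positive" is an assumption introduced here, not something stated in the text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; names and the decision rule are assumptions.
class MessageProcessing {

    // Message (2): the sender has already decided what, how many, and when.
    // The receiver only reads the values and acts on them.
    static String handlePreprocessed(Map<String, String> msg) {
        return "Buy " + msg.get("Nr_stocks_to_buy") + " shares of "
                + msg.get("Co") + " at " + msg.get("When_to_buy");
    }

    // Message (3): raw values arrive. The receiver must pick out the
    // relevant field and apply its own (here invented) decision rule.
    static String handleRaw(Map<String, String> msg) {
        // "Weather" and "Winner" are judged irrelevant and are ignored.
        double change = Double.parseDouble(msg.get("Stock_change_in_price"));
        return (change > 0 ? "Buy " : "Do not buy ") + msg.get("Co");
    }

    public static void main(String[] args) {
        Map<String, String> m2 = new LinkedHashMap<>();
        m2.put("Co", "IBM");
        m2.put("Nr_stocks_to_buy", "510");
        m2.put("When_to_buy", "16 o'clock CET");

        Map<String, String> m3 = new LinkedHashMap<>();
        m3.put("Co", "IBM");
        m3.put("Stock_change_in_price", "0.1");
        m3.put("Weather", "cloudy");
        m3.put("Winner", "Manchester United");

        System.out.println(handlePreprocessed(m2));
        System.out.println(handleRaw(m3));
    }
}
```

In the first method the logic lives with the sender; in the second, the same kind of variable values are transferred, but the filtering and the decision rule sit on the receiver side.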


What Is a Code? 


What is a code, and how does it relate
to logic processing using variables? A
code defines rules that translate physical
or mental details (such as sounds, images,
pressure, size, quantity, intention,
or even a different code) between two
independent systems using an agreed-upon
abstract convention of symbols.
Speaking and writing are examples, 
bridging gaps in location and time. A 
simple causal outcome based on only a 
mechanical effect does not use a code, 
so an axe blow does not split a log in two
thanks to a code that communicates intention.
Whether to swing an axe could
be communicated, however, using an
arbitrary symbol convention such as
{thumbs up / thumbs down}, {0 / 1} or
{oui / non}.

The sender and receiver can share 
the same symbol set (alphabet), like the 
International Flag Code for merchant 
ships and the use of ‘Co’ in (2) and (3) 
above. An example in cells is when a 
specific TF (sender value) interacts directly
with a CRE (generating a receiver
value). Another example is when a DNA 
template is used to generate DNA copies. 
The next nucleotide value to add to the 
growing chain is communicated directly. 

Alternatively, the sender and receiver 
could use different alphabets and vari- 
ables as long as there is an unambiguous 
way to map the symbol sets. In (2) above, 
the receiver could assign the value for 
“Co” to its own variable “Company” and 
also convert the time 16 o’clock accord- 
ing to its own time zone. This kind of 
linkage may require adaptor molecules
(Figure 3) or messenger molecules in
cells.

Figure 3. Communication between a sender and receiver system corresponds to transferring values to receiver variables. The alphabet of the sender (dark symbol in leftmost column) can differ from that of the receiver (dark symbol in rightmost column). In cells one or more adaptor molecules (middle column) may be needed to translate values between sender and receiver variables. The correct adaptor is identified through physical linkage with the sender variable’s value.
A. 1:1 mapping between sender and receiver variable.
B. 1:n mapping.
C. n:1 mapping.

If an informative ligand attaches to 
a TF, which then links to a CRE, this 
TF is now playing the role of an adaptor 
molecule. Other examples of adaptor
molecules are tRNAs in the genetic
code, where one end identifies a specific 
mRNA codon (the value), and the other 
end translates to a corresponding receiv- 
er value (which activated amino acid to 
add to the growing protein). Linking the 
two systems through a chained network 
of signals permits additional factors to 
be taken into account that could refine 
the details during transfer. 
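
The adaptor role can be sketched with a lookup table that sits between the sender alphabet (codons) and the receiver alphabet (amino acids). This is a minimal sketch: the class and method names are hypothetical, and only four entries of the standard genetic code are listed.

```java
import java.util.Map;

// Hypothetical sketch: a lookup table plays the role of the adaptor molecules.
class AdaptorSketch {

    // Adaptor: sender symbol (mRNA codon) -> receiver symbol (amino acid).
    // Two codons map to the same amino acid, an n:1 mapping.
    static final Map<String, String> ADAPTOR = Map.of(
            "GCU", "Ala",
            "GCC", "Ala",
            "CGU", "Arg",
            "AUG", "Met");

    // Returns the receiver value, or null when no adaptor recognizes the codon.
    static String translate(String codon) {
        return ADAPTOR.get(codon);
    }

    public static void main(String[] args) {
        for (String codon : new String[] {"AUG", "GCU", "GCC", "CGU"}) {
            System.out.println(codon + " -> " + translate(codon));
        }
    }
}
```

Neither alphabet needs to know the other; only the table (the adaptor) has to recognize both.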

Additional variables can be used 
within the sender and the receiver side
to perform necessary reasoning. These
can be independent codes, but at their 
interface there must always be pre- 
agreed conventions with respect to the 
meaning of the variables and how values 
are communicated. A receiver could 
then process the values assigned to its 
internal variables and then become a 
new sender, transmitting values to a new
receiver. A chain of sender/receivers can 
result, and examples in cells include 
signal cascades. 
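
A chain of sender/receivers can be sketched as composed functions, where each stage receives a value, processes it, and becomes the sender for the next stage. The stage names and thresholds below are invented for illustration and do not describe any particular cellular pathway.

```java
import java.util.function.Function;

// Hypothetical three-stage chain; every threshold here is an assumption.
class CascadeSketch {

    // Stage 1: a receptor converts a ligand concentration into bound/unbound.
    static final Function<Double, Boolean> RECEPTOR = conc -> conc > 1.0;

    // Stage 2: the bound state is re-sent as a second-messenger level.
    static final Function<Boolean, Integer> MESSENGER = bound -> bound ? 100 : 0;

    // Stage 3: the messenger level is interpreted as a response.
    static final Function<Integer, String> EFFECTOR =
            level -> level >= 50 ? "ACTIVATE" : "IDLE";

    // Each receiver becomes the sender for the next stage.
    static final Function<Double, String> CASCADE =
            RECEPTOR.andThen(MESSENGER).andThen(EFFECTOR);

    public static void main(String[] args) {
        System.out.println(CASCADE.apply(2.0));
        System.out.println(CASCADE.apply(0.5));
    }
}
```

Each stage uses its own value type (a number, a Boolean, a string), which mirrors the point that the codes along a chain can be independent as long as each interface is agreed upon.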


Multiple Codes 
Are Used in Cells 
Gordon Tomkins may have been the
first scientist to propose that the genetic
code is not the only code used in biology
(Tomkins, 1975). Cell needs are communicated
by different codes found on
DNA, RNA, proteins, filaments, sugars, 
cell membranes, and other cellular 
components. Occasionally the literature 
seems to incorrectly claim a code is 
involved, such as the so-called protein 
folding code (Dill et al., 2008), in which
multiple local activities occur in a pre- 
cise order as part of the folding process. 
The difficulty in this case is identifying 
abstract variables upon which Boolean 
logic is performed. In this example, 
it seems that only physical chemical 
forces are occurring in a continuous and 
time-ordered set of steps. No variables 
are waiting to be assigned values nor 
anticipate activation. 

Each code has its own language and
symbols. The genetic code to specify 
protein sequences is independent of the 
DNA-binding protein code to regulate 
gene expression (Hughes, 2008; Jolma et 
al., 2015) (Figure 4), even though both 
use DNA, and DNA codes sometimes 
share overlapping DNA nucleotides. 

Entire collections of CREs can be 
organized into cis-regulatory modules 
(CRMs), leading to DNA code variants, 
since each CRM uses a separate set of 
rules. Figure 5 shows a representative 
example, where three exons are regulated
by five such CRMs (Davidson, 2006, p. 
49). Depending on time (e.g., develop- 
ment stage), input signals, and cell lin- 
eage, different modules can be used to 
interact with the key “proximal module” 
nearest to the transcription apparatus. 
This is a clear example of Boolean logic 
being applied. 

In addition, by using different read- 
ing frames, the same code sometimes 
provides different messages. This was 
examined in mathematical detail for the 
genetic code at a recent conference on 
biological information (Montañez et al.,
2013, pp. 139-167). In a remarkably can- 
did paper, we read that “although dual 
coding is nearly impossible by chance, 
a number of human transcripts contain 
overlapping coding regions” (Chung et
al., 2007). These multiple codes prompted
Trifonov to point out, “The times of
surrender to ‘junk’ and ‘selfish DNA’
are over, and ‘non-coding’ becomes a
misnomer” (Trifonov, 2011, p. 2).

Figure 4. Representative example of cis-regulatory logic, showing the 2300 base-pair region preceding the coding region of gene endo16 of sea urchin. One or more proteins can bind to each of the cis-regulatory elements (gray boxes). The letters identify regions used for different purposes, such as regulation of key tissues during different phases of development (Davidson, 2006, pp. 49-51).

We will not attempt an exhaustive 
listing of all cellular codes at this time, 
and the DECODE program continues 
to bring new ones to light, but we will 
mention a few to demonstrate that cel- 
lular codes define variables and their val- 
ues but not procedural code as humanly 
readable instructions. 

There is a tRNA charging code 
without which the genetic code cannot 
be implemented (Hou and Schimmel, 
1988; Trifonov, 2011).

The histone code (Young, 2001; Jen- 
uwein and Allis, 2001; Strahl and Allis, 
2000; Cosgrove and Wolberger, 2005) 
involves post-translational modifications 
such as ubiquitination, phosphorylation, 
mono-, di-, tri-methylation, acetyla- 
tion, sumoylation, and biotinylation of 
various residues on the four histone
proteins (H2A, H2B, H3, and H4) that
form the nucleosome. These tags regu- 
late gene expression and other processes. 
Specific histone modifications can iden- 
tify the need for DNA mismatch repair, 
for example H3K36me3 (histone H3, 
lysine number 36 receives three methyl
groups) (Schmidt and Jackson, 2013) 
and H3K56 acetylation (Kadyrova et
al., 2013). Hypoacetylation of H3K56
by enzymes HDACs 1 and 2 facilitates
recruitment of nonhomologous end-joining
(NHEJ) proteins (Miller et al.,
2010). One should not overlook that
each cell type in eukaryotes uses its own
histone code (Carey, 2012, p. 188).

DNA methylation at the correct
location identifies which sections of
DNA should be transcriptionally active
euchromatin or inactive heterochromatin
(Bird, 2002).

Figure 5. Multiple cis-regulatory modules (CRMs) per gene, each composed of several CREs, permit independent regulation according to time, input signals and cell lineage. This typical example shows three exons (gray-checked boxes) regulated by five CRMs (black boxes). The CRMs are about 400 bp long, and the gene plus regulatory regions are spread out over about 30 kb of DNA. Alternative looping brings the relevant regions together (Davidson, 2006, p. 49). A: The “proximal module” 3 interacts with CRM 5; in B it interacts with CRM 1, and in C it interacts with CRM 2.

The tubulin code involves various
ligands that are added to and removed
from microtubules to affect several cellular
processes (Verhey and Gaertig, 2007; 
Janke, 2014). 

The splicing code of eukaryote 
pre-mRNAs permits different exons to 
be combined to produce alternative 
proteins (Tejedor and Valcarcel, 2010).

The nucleosome positioning codes, 
also called “Chambon rules” (Barash et 
al., 2010), are understood well enough 
to algorithmically automate their loca- 
tion to within one base for biological 
DNA sequences (Segal et al., 2006; 
Trifonov, 1980; Trifonov, 1981; Gabdank
et al., 2010). During development
eukaryote genes are activated in a
time-based manner using these codes for each
primary transcript (Segal et al., 2006) 
to establish a regulatory circuitry that 
controls which genes are activated or 
silenced (Yuh et al., 1998).

Interaction between genes has also 
revealed the Hox Code. Just a few Hox 
or homeotic genes control development 
of the body plan along the anterior- 
posterior axis. They code for transcrip- 
tion factors, which can either activate or 
repress large gene networks. The same 
transcription factor can repress one 
gene and activate a different one, and 
TFs are involved at many levels within 
developmental processes (Wellik, 2007). 
A typical regulatory region in eukaryote 
DNA is about 500 nucleotides long, on 
which four or five transcription factors 
can bind. On average eukaryote genes 
seem to have about three such regulatory 
regions (Bray, 2009, p. 191). 

The N-end code regulates the half- 
life of a protein using the identity of its 
N-terminal residue, which is determined
from the moment the protein is produced
(Varshavsky, 2011; Gibbs et al., 2014). 

In the sugar code, oligomers of 
carbohydrates serve as ligands for the 
transfer of information, acting with
lectin protein receptors (Gabius et al.,
2011; Murphy et al., 2013). The large
number of hydroxyl groups available 
offers enormous storage capacity, vastly 
more than the genetic code could (An- 
dré et al., 2015). 

The adhesive code (Redies and
Takeichi, 1996; Shapiro and Colman, 
1999) uses differences in adhesiveness 
between neural cells in the primordial 
neuroepithelium to first establish seg- 
mentation and then the emergence 
of specialized structures such as brain 
nuclei, cortical layers, fiber tracts, and 
neural circuits using cadherins. 

A niche code has been proposed 
(Forsberg and Smith-Berdan, 2009). 
Hematopoietic stem cells (HSCs) must 
generate daughter HSCs and a variety 
of mature cells in response to stress in a 
regulated manner. HSCs are found in 
specialized niches in bone marrow, and 
there is a regulated adhesive interaction 
between niche cells and HSC compo- 
nents such as integrin, another example 
of adaptor molecules. 

Signal Transduction Codes are 
used when extracellular signals (“first 
messengers” such as hormones, neu- 
rotransmitters, and paracrine/autocrine 
agents) attach to a specific receptor on 
the cell membrane, activating a smaller 
number of second messengers such as
calcium, cAMP, nitric oxide, and
phosphorylation cascades (Figure 6). One
signaling molecule can cause many responses,
such as changes in the cell’s metabolism or
gene expression (an example of the 1:n
variable mapping shown in Figure 3).

There is a vast research literature 
on this topic, and resources on signal 
transduction pathways are available 
on-line in databases such as “NetPath” 
for humans (http://www.netpath.org/). 
The latest research is correcting the 
view that simple linear cascades are 
used. Instead, large networks consisting 
of hundreds or thousands of proteins are 
involved (Walhout et al., 2013, p. 93). 
Note the rich potential to interact with 
other networks and codes to dynamically
integrate multiple cell inputs and
needs.

The actin cytoskeleton uses adapter 
molecules to identify materials that 
should interact there, which implies 
a cytoskeleton code (Barbieri, 2003; 
Barbieri, 2008, chapter 8). 

The complex firing of neurons in the 
brain uses some kind of neural code or 
codes, since meaning is gleaned that 
permits the internal and external world 
to be understood (Nicolelis and Ribeiro, 
2006; Cessac et al., 2010; Jessell, 2000; 
Marquardt and Pfaff, 2001; Flames et 
al., 2007). In spite of intense interest, it 
is far from being understood. 

A phosphorylation code in Hedge- 
hog signal transduction has also been 
identified (Chen and Jiang, 2013; Ficz, 
2015; Schübeler, 2015).

The miRNA code can up- or down-regulate
individual mRNA levels according
to eukaryote cell type (Carey, 2012,
pp. 191-194). 

A CpG epigenetic code in eukary- 
otes governs millions of methylations 
on DNA. Methylation near the gene start
site blocks transcription, but methylation
within the gene itself enhances expression
(Jones, 2012).
In this read/write/delete system, DNA- 
methyltransferases (DNMT) add methyl 
groups, and there are many mechanisms 
to remove them in a tissue-specific man- 
ner. Methylation is most dramatic in 
the brain (Keverne et al., 2015). Most 
of the methyl groups are removed in 
the fertilized egg (zygote) (Lee et al., 
2014), otherwise the next generation 
would begin with a specialized and not 
pluripotent cell. 

The ventral neural tube is an 
example of special codes used in cells 
that interpret a gradient concentration. 
Distinct classes of neurons are produced 
in the ventral neural tube according to 
local concentration of Sonic Hedgehog 
(Shh) (Briscoe et al., 2000). 

Many secreted and membrane 
proteins contain N-terminal signal se- 
quences that communicate their target 
locations (Hegde and Bernstein, 2006).

Overview of signal transduction pathways involved in apoptosis
Figure 6. Example of a signal cascade pathway, here involved in programmed cell death (apoptosis). (Source of diagram: Wikimedia Commons, the free media repository, https://commons.wikimedia.org/wiki/File:Signal_transduction_v1.png). See also Klipp et al., 2009, pp. 135-142.





Codes in Cells Can Overlap 
Cellular codes often overlap and there- 
fore require degeneracy to not overly 
restrict each other. Since codes can be 
implemented using biochemicals which 
themselves rely on the genetic code, 
complex design tradeoffs are neces- 
sary. When planned correctly, the best 
implementation must be as robust as 
possible, taking into account the severity
of possible errors for all the affected
codes (mutations, mistranslation, etc.).


Degeneracy with respect to one 
code could be critically important for 
a different one. As an example, differ- 
ent codons could represent the same 
amino acid in the genetic code, but 
each codon can specify how rapidly 
that position is translated. Figure 7 
describes this using a section of Java 
programming. 

In probably all cases, assuming 
complete degeneracy for a code would 
be a mistake. Variants of a class of CRE
could all be recognized by the same
TF, but the CRE sequence differences 
specify how long and often to remain 
attached, in which tissue type, the tim- 
ing of activity during a cell cycle, and 
for what stages of development. 

The use of multiple and overlap- 
ping codes saves material and energy 
but is too constraining and requires 
too much foresight to find applicabil- 
ity in general purpose programming 
by humans.

    public class TranslateCodons {
        public void ProcessEachCodon(String codon) {
            String c = codon;
            switch (c) {
                case "GCU": {
                    // 1: Delay translation by time t1. E.g., try {Thread.sleep(5);} catch (Exception e) {}
                    // 2: Add Alanine to the growing chain
                    break;
                }
                case "CGU": {
                    // 2: Add Arginine to the growing chain
                    break;
                }
                // Remaining case statements...
            }
        }
    }

Figure 7. Java example of codons being used for two unrelated purposes: to determine amino acid sequence and translation rate at that position of the mRNA.


Each Code Uses Its Own Processor 

It is important to understand the distinc- 
tion between variables and the values 
they can assume. Cellular variables 
possess recognizable steric and elec- 
tronic features and wait for activation 
by a sender (which provides the values). 
For example, transcription in bacteria 
through RNA polymerase involves vari- 
ables, like the “sigma factor recognizing 
promoter” (the -35 and -10 elements 
located before the beginning of the 
sequence to be transcribed). As possible 
values these locations could be unbound 
or bound to one of several possible 
“sigma factors.” The sigma factor can also 
interact with a distinct set of promoters 
(Ishihama, 2000). 

For each coding system there are 
special processors designed to interpret
the relevant values. When TFs bind to
cis-elements to regulate transcription, an
appropriate three-dimensional proces- 
sor involving many proteins must be 
organized which can include direct or 
indirect adaptors (Zhou et al., 2015). 
The hardware aspect of cellular design 
is discussed in Part 2. 


Software and Hardware 
Tightly Integrated 
Unlike a Turing or von Neumann 
Machine (Von Neumann architecture, 
n.d.), cells must repair themselves, 
generate their own energy, adapt to 
new challenges, and reproduce autono- 
mously with all necessary components 
over many generations. The solution is 
a complete synergistic interaction between
the software and hardware. The
physical DNA, RNA, and protein-based
components that produce the hardwired 
biochemical processes are themselves 
constructed and replaced by relying on 
data provided through preexisting DNA, 
RNA and proteins. 

It is often easy to identify the physi- 
cal components of cells but overlook 
informational aspects. Each of the 260 million
photoreceptors on a human retina could 
be identified, but the semantic content 
implied by the photons landing on them 
is then funneled on to only 2 million 
connected ganglion cells before send- 
ing to the correct processing regions of 
the central nervous system (Gazzaniga 
et al., 2009). Here information is being 
interpreted, compressed, and transferred. 

As a second example, microtubules
do much more than only maintain a
cell’s shape. Per microtubule a hundred
thousand or more globular protein units 
grow in many directions and degrade 
constantly until coming into contact 
with a specialized region of a chromo- 
some centromere (Sullivan et al., 2001), 
or membrane, after a signal arrives 
there, at which point a firm attachment 
prevents degradation (Kirschner and 
Gerhart, 2005, pp. 148-152). These at- 
tachment regions are sensors (variables) 
that assume a value (i.e., when activated 
by the tip of the microtubule), that 
recruits proteins to produce a decoding 
molecular machine. 


Logic Processing Is 
Distributed and Hierarchical 
Different prokaryote species form eco- 
logical systems with necessary genes 
distributed among the members (Sonea 
and Mathieu, 2001), which is why a 
particular function requiring several 
genes can be assembled in one member 
through horizontal DNA transfer. Plas- 
mids in prokaryotes are another example 
of distributed information processing. 
In eukaryotes, information processing 
is also distributed, such as when bac- 
teria digest food separately from the 
host organisms’ germ line. Different 
cell lineages also distribute the effort, 
where each cell type has characteristic 
ensembles of activated and deactivated 
genes. Proteins, polysaccharides, lipids, 
and other substances are used to interact 
with receptors on cell surfaces and pro- 
vide communication signals to convey 
metabolic and developmental status 
back and forth (Aricescu and Jones, 
2007; Takada et al., 2007; Yamada and 
Nelson, 2007; Widelitz, 2005). Inter- 
cellular communication also occurs by 
molecular diffusion through air or water 
using gases, amino acids, oligopeptides 
and vitamins as signals (Bogdan, 2001; 
Chen et al., 2005; Fuqua et al., 2001;
Chambon, 1995; Lazazzera, 2001).

Hierarchical information processing
also occurs. As examples, low-level
logic processing occurs when individual
DNA nucleotides define individual RNA
nucleotides, and when codons specify 
amino acids. Once a protein has formed, 
additional processing occurs to transfer 
it to the correct cell location, later to 
integrate into molecular machines, 
enzymatic networks, and metabolic 
networks. Thereafter ever more complex 
features can develop, such as entire 
eukaryote organelles which themselves 
become part of a properly regulated cell,
on up to organs, whose operations must 
also be carefully regulated to permit a 
viable organism that interacts within 
an ecology. 

In addition to such hierarchical 
integration, we will see in the accompa- 
nying paper that many control systems 
in cells—each with their own codes —in- 
teract mutually within what often seems 
to be the same hierarchical level. 


Generic Insights from 
Computer Systems 
Architecture 
The explosive development of computer 
technologies is the result of collabora- 
tion between millions of scientists, en- 
gineers, and mathematicians worldwide. 
Fundamental to this success are interop- 
erability conventions and standards 
(such as the Open Systems Interconnection
model). This permits specialists in
various hardware and software areas to 
focus on and develop technologies from 
which integrated systems result. Using 
these design insights, we will interpret 
cellular behavior by examining software 
and hardware aspects individually and 
consider different levels in the system 

at which guidance is provided. 

Another insight humans have gained 
is the design of subsystems that can be 
assembled. We discussed lateral and hierarchical
logic processing above (Thanbichler
and Shapiro, 2008; Schneider
and Grosschedl, 2007). An external 
printer can be built separately and then 
linked to the rest of the system. To work
properly the hardware devices often also
require their own dedicated software
(e.g., “drivers” must be installed).


Software Elements Used to 
Implement Processing Logic 
Before examining software constructs 
used by computers and cells, let us consider
a simple program to calculate the
factorial of a number (Figure 8).

Several general principles can be
discerned.

1. The programmer did not need to 
consider how the solution would 
be implemented on hardware nor 
the operating system details. Only 
the logic needs to be accurately 
expressed symbolically. 

2. There is a language with a precise 
grammar that contains several 
generic constructs—for example, 
iteration (with a defined starting and 
finishing value) and a Boolean test (if 
i has a value of n or less, then add 1 
and continue, otherwise terminate 
the iteration). 

3. The same processing logic could 
be applied with different values and 
meanings for the variable n. 

4. The algorithm could be copied into 
other programs and modified. 

5. The variables belong to a specific 
data type and have properties con- 
sistent with them. In the example, 
i and n must have an integer value: 
one cannot assign a value of “Smith” 
nor “True” to them. 

6. The variables can represent real objects,
like dollar bills, but the choice
of the symbols and what they do are 
physically independent of what they 
specify. 

7. The algorithm continues to make 
sense if each variable is replaced 
by another unique symbol. Even a 
three-dimensional abstract symbol 
could be used and the values as- 
signed could also be represented by 
no code currently in use by comput- 
ers. However, changes in hardware 
would then become necessary.

8. To have any value, the outcome from
the algorithm needs to be retained
or have some kind of effect.

    public class Factorial {
        public static void main(String[] args) {
            int n = 7;
            int result = 1;
            for (int i = 1; i <= n; i++) {
                result = result * i;
            }
            System.out.println("The factorial of " + n + " is " + result);
        }
    }

Figure 8. Programming using Java to calculate the factorial of a number to illustrate the use of common software constructs to solve problems independent of the hardware implementation.

All these and other principles can 
also be identified in cellular information 
processing. In the example in Figure 
8, we see how limitless cases could be 
solved by merely replacing the numbers 
i and n as needed. This works only if 
programming constructs such as itera- 
tion, assignment of values to variables, 
and so on, exist. Otherwise a unique 
mechanical arrangement would be 
needed to solve each example. It is this 
use of general-purpose symbolic logic, 
which can be mapped to mental or 
physical objects, that is so special about 
computers and cells. 
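
Principles 3 and 4 can be illustrated by wrapping the loop of Figure 8 in a method so that the same logic is reused with different values of n. This is a sketch under our own assumptions: the class name, the use of `long`, and the rejection of negative input are choices made here, not part of Figure 8.

```java
// Sketch: the logic of Figure 8 reused for any n (hypothetical class name).
class FactorialReuse {

    // The same iteration and Boolean test as in Figure 8, with n as a parameter.
    // A long overflows for n > 20, so larger inputs would need BigInteger.
    static long factorial(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        long result = 1;
        for (int i = 1; i <= n; i++) {
            result = result * i;
        }
        return result;
    }

    public static void main(String[] args) {
        // Principle 8: the outcome is retained (returned) and has an effect (printed).
        for (int n : new int[] {0, 7, 10}) {
            System.out.println("The factorial of " + n + " is " + factorial(n));
        }
    }
}
```

The data type of n (principle 5) is enforced by the compiler: a call such as `factorial("Smith")` does not compile.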

After this long, but necessary, prepa- 
ration, we are finally ready to examine 
three important topics in the art of 
designing software: generic software 
data structures; generic programming 
elements; and file formatting. These are 
fundamental for computers. 


I. Generic Software Data Structures. 
Let us examine how data is usually 
structured in modern computers and 
cells to facilitate use in general-purpose
programming constructs discussed in
section II.


Symbols in an alphabet 

Codes rely on an alphabet of elementary 
symbols. Modern digital computers use 
an alphabet of two symbols {0, 1} called 
bits. Cells use dozens of alphabets for 
their many codes. DNA is composed of
four nucleotides abbreviated {A, C, G,
T}, and RNA also uses four nucleotides {A,
C, G, U}. Other codes rely on small molecules
or ions such as cAMP (Ashcroft, 1997; Krysko
et al., 2005) and calcium (Wagner et
al., 2015), or on small parts of larger
molecules.

One or several symbols taken jointly 
define an item, field, constant, variable, 
or value. In the past, telegraph messages 
used 5-letter commercial coded values 
such as BYOXO (“Are you trying to wea- 
sel out of our deal?”) and LIOUY (“Why 
do you not answer my question?”). 
Other conventions also exist, such as 
LOL (“Laughing Out Loud”) and CU 
(“See You”). In the extended ASCII code
ISO 8859-1, ‘00001010’ represents a
Line Feed, ‘01000001’ represents the
letter A, and ‘00111000’ represents the
decimal digit 8. The codeword length
of values can be fixed, as in extended
ASCII and the genetic code, or can vary,
as in compressed codes used
to store and transmit electronic data
(Togneri and deSilva, 2003). There
are design trade-offs to consider when 
deciding whether to use a fixed or vari- 
able length (Truman, 2012). 
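
A fixed codeword length has a practical consequence that can be sketched in Java: a stream of symbols can be cut into values without any separator. The class and method names below are hypothetical; the 8-bit width follows the extended ASCII example in the text.

```java
// Sketch of a fixed-length (8-bit) code; class and method names are assumptions.
class FixedWidthCode {

    static final int WIDTH = 8;

    // Encode one character (code point 0-255) as an 8-symbol codeword over {0, 1}.
    static String toCodeword(char c) {
        if (c > 255) {
            throw new IllegalArgumentException("character outside the 8-bit range");
        }
        String bits = Integer.toBinaryString(c);
        // Left-pad with zeros so that every codeword has the same length.
        return String.format("%" + WIDTH + "s", bits).replace(' ', '0');
    }

    // Because every codeword has the same length, the stream can be split
    // at fixed positions; no delimiter symbol is needed.
    static String decode(String bitStream) {
        if (bitStream.length() % WIDTH != 0) {
            throw new IllegalArgumentException("incomplete codeword");
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < bitStream.length(); i += WIDTH) {
            out.append((char) Integer.parseInt(bitStream.substring(i, i + WIDTH), 2));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toCodeword('A'));
        System.out.println(toCodeword('8'));
        System.out.println(decode(toCodeword('A') + toCodeword('8')));
    }
}
```

The genetic code behaves analogously with a width of three nucleotides per codon. A variable-length code gives up this simple splitting in exchange for shorter messages.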

The symbols used by computer 
programs must be exact to be processed. 
Confirmation and Conformation are 
almost identical, but not the same. 

Different codes can be linked us- 
ing different alphabets. A sender code 
could be restricted to a symbol from, e.g., 
{green, yellow, red}, which the receiver 
could translate to its system, e.g., limited 
to {1, 2, 3}. 
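
Linking two alphabets requires only an agreed convention, which can be sketched as a table. The text does not say which color corresponds to which number, so the assignment below (green to 1, yellow to 2, red to 3) is an assumption, as are all names.

```java
import java.util.EnumMap;
import java.util.Map;

// Sketch: translating a sender alphabet into a receiver alphabet.
class AlphabetLink {

    enum SenderSymbol { GREEN, YELLOW, RED }

    // The agreed convention; this particular assignment is an assumption.
    static final Map<SenderSymbol, Integer> CONVENTION = new EnumMap<>(SenderSymbol.class);
    static {
        CONVENTION.put(SenderSymbol.GREEN, 1);
        CONVENTION.put(SenderSymbol.YELLOW, 2);
        CONVENTION.put(SenderSymbol.RED, 3);
    }

    static int toReceiver(SenderSymbol s) {
        return CONVENTION.get(s);
    }

    public static void main(String[] args) {
        for (SenderSymbol s : SenderSymbol.values()) {
            System.out.println(s + " -> " + toReceiver(s));
        }
    }
}
```

Using an enum for the sender side also restricts the sender to its alphabet: no symbol outside {GREEN, YELLOW, RED} can be sent.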

When large molecules are used to 
convey coded meaning in cells, typically 
a small portion is informative, and the 
rest plays an adaptor molecule role or 
is used for the implementation details. 
Consider proteins. Portions of differ- 
ent residues are integrated to define 
a joint “symbol” having unique steric 
and electronic properties. The result- 
ing symbols must be decoded using 
three-dimensional processors. In the 
fluid environment of cells under vary- 
ing temperatures, the decoders must be 
more flexible than in computers. One 
consequence is that a portion of differ- 
ent amino acids could be combined to 
produce functionally the same symbol 
meaning in three dimensions. 


Data types 
Modern computer languages enforce 
data typing, which defines the kinds of 
values that can be assigned to variables to 
prevent errors. Common types include 
integer, floating-point number, charac- 
ter, alphanumeric string, and Boolean. 
Each kind of variable for biological 
codes is restricted to a range of values. 
The genetic code uses DNA and mRNA 
codons, whereas the enzyme complexes 
used by the histone code do not process 
codons
(http://www.cellsignal.com/contents/resources-reference-tables/histone-modification-table/science-tables-histone).

As another example, many mRNAs 
can interact with only some miRNAs 
(which specify what is to occur to the 
mRNA; Verdel et al., 2009; Sugiyama et 
al., 2005). This corresponds to 1:n, n:1,
or n:m variable binding in Figure 3. In 
addition, only certain noncoding RNA 
data types (specific siRNAs, piRNAs, Alu 
RNAs etc.) are recognized by mRNA 
binding proteins. 


Data type subsets 
A subset of a data type can also be estab- 
lished for a specific program or module 
to further narrow acceptable values 
in some programming context—for 
example, only certain acceptable city 
codes for telephone numbers in a city, 
or a list of alphanumeric identifiers for a
product line. We find this principle also 
in cells. The codons to represent Ala- 
nine must come from the subset {GCU, 
GCC, GCA, GCG} and Arginine from 
{CGU, CGC, CGA, CGG, AGA, AGG}. 
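
The subset idea can be sketched with sets of allowed values. The two codon subsets are taken from the text; the class and method names are hypothetical.

```java
import java.util.Set;

// Sketch: a data type subset as a set of acceptable values.
class CodonSubsets {

    static final Set<String> ALANINE = Set.of("GCU", "GCC", "GCA", "GCG");
    static final Set<String> ARGININE = Set.of("CGU", "CGC", "CGA", "CGG", "AGA", "AGG");

    // A value is accepted only if it belongs to the subset for that context.
    static boolean isAcceptable(Set<String> subset, String codon) {
        return subset.contains(codon);
    }

    public static void main(String[] args) {
        System.out.println(isAcceptable(ALANINE, "GCA"));
        System.out.println(isAcceptable(ALANINE, "AGA"));
        System.out.println(isAcceptable(ARGININE, "AGA"));
    }
}
```

All of these values share the data type "codon", yet each context narrows the values it accepts, much like a list of valid city codes narrows the type "string".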


Operations are defined
for each data type
Specific computing methods or opera- 
tions are permitted for each data type 
(and also for complex structures like 
matrices, arrays, etc.). One can negate 
a Boolean variable to convert True into
False, but negating a data type “character”
makes no sense. String variables can
be concatenated, for example phrase =
“white ” + “house” to form “white
house,” but this won’t work for variables
such as integers.
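
A brief Python sketch of type-specific operations follows. The helper try_concat is a name assumed for this illustration. Python is used here only as a convenient example language; other languages report the mismatch differently.

```python
# Negation is defined for Booleans.
flag = True
print(not flag)

# Concatenation is defined for strings.
phrase = "white " + "house"
print(phrase)


def try_concat(text, number):
    """Try to concatenate a string with an integer.

    Returns the result, or the name of the error if the operation
    is not defined for this combination of types.
    """
    try:
        return text + number
    except TypeError as err:
        return type(err).__name__


print(try_concat("white", 5))
```

In this language the string-plus-integer attempt is refused with a type error instead of yielding a meaningless value.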

This principle is also found in cells. 
Each code used with DNA, RNA, pro- 
teins, sugars, or membranes is limited 
to its variable type(s) and their allowed 
operations. Consider the processing 
operations that can be performed with 
mRNA’s data type “codon.” The values 
can be read at the A (acceptor) or E 
(exit) portion of ribosomes (the receiver 
variables), they can be “concatenated” 
on each side to form polymers, and
they can base-pair in specific ways (A-U
and C-G). These kinds of operations 
cannot be assumed for other data types, 
such as hormones, transcription factors, 
or neurotransmitters. Ribonucleases 
and restriction enzymes can cut DNA 
strands using a subset of acceptable pat- 
terns (the receiver variable), but these 
locations are not processed on a codon 
basis as the genetic code does. 


Group item 

Elementary fields or items in computer 
languages can store values long-term 
using compound symbols. In many 
programming languages, several el- 
ementary items can also be combined 
and processed jointly for read and write 
purposes. As an example, a group item 
“address” could be composed of elemen- 
tary items “house-number,” “street,” 
“city,” and “country-code.” Additional 
hierarchical clustering is also used in 
computer languages (such as C, Pascal, 
and Cobol), meaning group items can 
be further combined into records for 
example. This principle is also found 
in data transfer conventions like XML. 
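
As a hedged illustration, the “address” group item could be written in Python as follows. The class names Address and CustomerRecord and the sample values are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Address:
    """Group item built from four elementary items."""
    house_number: str
    street: str
    city: str
    country_code: str


@dataclass
class CustomerRecord:
    """A record that contains a group item as one of its fields."""
    name: str
    address: Address


record = CustomerRecord(
    "A. Example",
    Address("12", "Main Street", "Springfield", "US"),
)

# The whole group item is read in one step, without naming its parts.
copy_of_address = record.address
print(copy_of_address)
```

The elementary items remain individually accessible (for example record.address.city), but the group can be read or written as a single entity.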

In cells, we recognize this principle 
whenever elements containing substruc- 
tures are processed as a complete entity. 
One example is telomeres at the end 
of chromosomes, composed of groups 
of repetitive nucleotide patterns (e.g., 
TTAGGG in vertebrates), which are 
replenished by the enzyme telomerase 
reverse transcriptase. The six individual 
nucleotides are processed as an ensemble.
In S. cerevisiae, each C1-3A/TG1-3
repeat, taken jointly, constitutes a
potential binding site for Rap1 proteins,
which recruit additional proteins (Williams
et al., 2010).

In mammals, shelterin protein complexes
regulate telomerase activity. Two
of the six subunits (TRF1 and TRF2)
bind uniquely to individual double-stranded
TTAGGG repeats (de Lange, 2010).
So once again we recognize the concept
of a grouping of elementary components.
At a higher level, multiple copies of the
individual patterns are treated as a new
grouped entity and added all together to
a chromosome by TERT (TElomerase
Reverse Transcriptase) using a piece of
template RNA known as TERC (Jady et
al., 2006).

Group items consisting of smaller 
group items are not limited to repetitive 
patterns. Multiple codons are placed 
together within exons, which them- 
selves are integrated into a primary 
RNA transcript. Processing as a whole 
occurs, such as in retrotranscription 
and rearrangements with the help of 
transposable elements. 

The concept of group processing 
reminds us of how several residues 
jointly lead to discrete motifs in folded 
proteins and how a larger number of
residues work together to form secondary 
structures such as alpha helices and beta 
sheets. Different nucleotide combina- 
tions also produce special RNA motifs. 

Microbial genomes are also known 
to have an operon-like organization at 
various scalar levels (Audit and Ouzonis,
2003).


Concatenated index 
In relational databases such as Oracle, 
a unique combination of one or more 
index values can be used to identify data 
records. Similarly, multiple nucleotides 
define promoters to identify the location
of genes. 


Array 
Arrays and linked lists contain a series of 
values. In arrays, values of some datatype 
are stored in numerically indexed posi- 
tions. The position within the array is 
informative and can be used directly in 
programming logic. If a certain value is
always located at a specific index posi- 
tion (or a limited range of positions es- 
tablished in advance), it can be accessed 
directly by processing logic. An example 
using Fortran (a language well-suited to 
matrix calculations) is shown below (4). 
Assume that the results of a student’s dif- 
ferent exams are stored in known index 

Figure 9. Nucleotide patterns at specific locations in bacteria define consensus promoter elements. The Pribnow box is centered at the -10 position, and a second component is often found at the -35 nucleotide position upstream from the start of transcription. Other regulatory elements are sometimes centered at the -41 or -61 position. If each nucleotide in the regulatory region is stored in an array, the index position can be used to program logical tests.



positions of array Examresults, and index 
position 3 contains the points obtained 
for the math test. The programming logic
might look like this:

   IF (Examresults(3) .GE. 70 .AND. Examresults(3) .LT. 85) THEN
      Mathgrade = 'B'
   END IF                                                     (4)














Highly relevant to our discussion 
about cells, the value of interest could 
in principle be stored in different array 
positions if the acceptable alternatives 
are established in advance. Suppose
there were two examiners, and a result
determined by the first is stored
in Examresults(20), whereas a result
determined by the second is stored in
Examresults(21). Now the
program must determine the test result
for the math exam by looking up the
contents of array positions 20 and 21 and
selecting the one that holds the exam result.
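
A minimal Python sketch of this lookup is given below. It assumes that an unused position holds None and that the positions 20 and 21 are fixed in advance. The names MATH_POSITIONS and math_result are illustrative only, and Python counts array positions from zero, unlike Fortran.

```python
from typing import Optional, Sequence

# Alternative index positions, established in advance (assumed values from the text).
MATH_POSITIONS = (20, 21)


def math_result(exam_results: Sequence[Optional[int]]) -> Optional[int]:
    """Return the math result from whichever permitted position holds a value."""
    for index in MATH_POSITIONS:
        value = exam_results[index]
        if value is not None:
            return value
    return None  # neither examiner has stored a result


results = [None] * 30
results[21] = 78  # the second examiner graded this student
print(math_result(results))
```

Only the positions listed in advance are inspected, and all other array positions are skipped.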

Prokaryote promoters illustrate array
data storage and processing. For the
Pribnow box, a six-nucleotide consensus
TATAAT is used by E. coli, centered at
the -10 position, and often a second pattern,
TTGACA, is centered at -35 (Figure
9). For some bacteria or genes, the array
positions to check could be slightly
shifted, but legitimate indexed positions
to be tested are known in advance. We 
will not elaborate here on the reasons for 
using alternative array positions, but it 
could be to regulate transcription rates or 
the results of genomic rearrangements. 
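
The indexed test can be sketched in Python under some simplifying assumptions: the upstream region is a string whose last character is position -1, the hexamer is assumed to begin at -12 (so that it sits around -10), a shift of one position either way is permitted, and only exact matches to the consensus count. The function name has_box and the test sequences are hypothetical.

```python
def has_box(upstream: str, consensus: str, starts) -> bool:
    """Check whether the consensus occurs at one of the permitted start positions.

    Positions are negative indices counted back from the transcription start,
    so upstream[-1] is position -1.
    """
    for start in starts:
        end = start + len(consensus)
        # A slice that would run up to position 0 must be written without an end.
        window = upstream[start:end] if end < 0 else upstream[start:]
        if window == consensus:
            return True
    return False


# Hypothetical upstream region with TATAAT occupying positions -12 to -7.
region = "G" * 20 + "TATAAT" + "G" * 6
print(has_box(region, "TATAAT", (-13, -12, -11)))
```

Real promoters often deviate from the consensus, so an exact comparison is a deliberate simplification.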

There are many more examples 
of array processing in cells. In a typi- 
cal ca. 22-nucleotide miRNA, usually 
only 6-8 adjacent or almost adjacent 
nucleotides (the seed region) at the 5’ 
end are relevant, which is also true of 
the corresponding receiver variable on 
an mRNA. Logical tests on candidate 
miRNAs and their binding sites can 
therefore be performed using array index 
values. As another example, the coding 
parts of DNA and mRNA specify amino 
acid sequences, and the nucleotides 
need to be processed as triplets with no 
frameshifts. This permits translation to 
read the codons located in sequential 
index positions along mRNAs. In other 
words, each array index position does not 
contain a nucleotide, but a codon. Once
the mature mRNA is ready for transla- 
tion the length remains fixed, another 
characteristic of arrays. 
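
As a small sketch of this view, a coding region can be split into a fixed-length array of codons in Python. The function to_codons and the short example sequence are assumptions for illustration.

```python
def to_codons(mrna_coding: str) -> tuple:
    """Split a coding region into triplets, one codon per index position."""
    if len(mrna_coding) % 3 != 0:
        raise ValueError("coding region length must be a multiple of three")
    # A tuple is used because its length is fixed once it has been created.
    return tuple(mrna_coding[i:i + 3] for i in range(0, len(mrna_coding), 3))


codons = to_codons("AUGGCUCGAUAA")
print(codons)
print(codons[1])  # the codon at index position 1
```

Here each index position holds a whole codon, and reading the positions in order corresponds to reading without frameshifts.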

Additional examples of processing
array data include the symbols used by
mobile elements to recognize insertion
motifs; the portions of folded TFs that
recognize cis-regulatory combinations;
and the portions of enzymes that recognize
restriction sites.

We see why many proteins must fold 
reliably into the same three-dimensional 
structure. This brings the relevant 
elementary symbols together so each 
can be assigned to a three-dimensional
index, “protein_position[i,j,k].” The
relevant array positions refer to location 
in three-dimensional space and not the 
primary protein sequence. The resulting 
symbols need to be defined well enough 
to permit variables and their values to 
recognize each other, synergistically 
molding themselves together and avoid- 
ing false positives. 

Whenever, for a DNA- or RNA-based
code, the distance between key nucleotide
patterns is exactly or almost exactly
known (including epigenetically modified
nucleotides), then an indexed array
seems to be a better description than a 
linked list. Knowing index values allows 
other array positions to be skipped and 
ignored. This is physically implemented 
in cells by constraining the decoders 
(e.g., portions of proteins) to specific 

Figure 10. Linked lists and arrays. A. Double-linked lists contain data (non-shaded boxes) and links (gray boxes), which point to the preceding and next member of the list. B. Arrays contain data at static locations identified by index values.



ranges of distance and location between 
the relevant data elements. (In the case 
of linked lists, however, a more compli- 
cated search for the relevant variables 
must be implemented). 

We suggest below that DNA rep- 
lication and transcription processing, 
which are used by different codes than 
those just discussed, are based not on 
arrays but linked lists. There are subtle 
differences between these kinds of data 
structures. For example, in computers 
the length of an array is established 
when the array is created (unlike linked 
lists, which grow and shrink as needed). 
Remarkably, in cells the same nucleo- 
tides are sometimes used by different 
codes concurrently, each with different 
kinds of data structures. 


Linked List 

A linked list is a chain of data and link 
values. The data part contains the useful 
information, and the link has the address 
of the next or previous element. Single- 
linked lists only point to the address of 
the next element, whereas double-linked 
lists include pointers to the next and the 
preceding data location (Figure 10). 

Either an array or linked list could
be used for programming purposes. They
do differ, however, in internal implementation
in ways that affect the execution
speed of data insertion, deletion, updating,
and searching. One difference is
that the index value where specific data
is located in linked lists is generally not
known in advance and can change. Unlike
arrays, linked lists can
grow and shrink dynamically as needed.

To illustrate the difference, candi- 
date CRMs that could interact with the 
proximal module to regulate a gene 
are separated by distances that can vary 
(Figure 5). Finding the activated CRM 
requires a search for relevant data sym- 
bols whose positions are not defined 
by unique index values. An additional 
complication is that the regions of the
CRM that are to bind to the proximal
module involve CREs whose positions
are not static in three dimensions and
must also be searched for.

The same reasoning applies when 
spliceosomes identify variable intron 
content whose boundary is defined by 
splicing signals (Rino and Carmo-Fon- 
seca, 2009). The introns are generally 
not identifiable a priori by fixed index
positions, and the spliceosome succeeds
even if a transcription error adds or eliminates
nucleotides.


In linked lists, elements of a defined 
data type (which could be a complex 
group of different item types) can be 
added to the end, inserted at any posi- 
tion, modified or removed (for arrays 
also, but that requires more processing 
effort). In addition, another linked list 
can be added on to another at any posi- 
tion. One disadvantage, of course, is that 
more effort is required to find a specific 
value compared to when its indexed 
location is known in advance. 

In RNA, the four nucleotides {A, 
C, G, U} are attached to riboses (and 
deoxyriboses for DNA), which are held 
together along the backbone by phos- 
phate groups (Figure 11). 

Analogous to linked lists, nucleotides 
can easily be added, removed, or in- 
serted simply by breaking and reattach- 
ing “address pointers,” here phosphate 
bonds. This is an excellent description 
of what happens when DNA chains 
replicate one base after the other, RNA is 
transcribed, introns are removed, exons 
are spliced together, and chromosome 
crossover occurs. Absolute index values 
per se are generally not relevant for the 
logic processing, unlike for arrays. 
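
The analogy can be made concrete with a minimal single-linked list in Python. This is a sketch only: the class Node, the helper functions, and the ten-letter example sequence are assumed for illustration, and removal is done by re-pointing one link, much as the text describes for phosphate bonds.

```python
class Node:
    """One list element: a data value plus a link to the next element."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def from_sequence(seq):
    """Build a linked list from a string, one node per nucleotide."""
    head = None
    for value in reversed(seq):
        head = Node(value, head)
    return head


def to_string(head):
    """Walk the links from the head and join the stored values."""
    parts = []
    node = head
    while node is not None:
        parts.append(node.value)
        node = node.next
    return "".join(parts)


def splice_out(head, start, count):
    """Remove `count` nodes beginning at zero-based position `start`.

    No element is shifted; a single link is re-pointed past the removed run.
    """
    dummy = Node(None, head)
    prev = dummy
    for _ in range(start):
        if prev.next is None:
            return head  # start lies beyond the end, so nothing is removed
        prev = prev.next
    node = prev.next
    for _ in range(count):
        if node is None:
            break
        node = node.next
    prev.next = node
    return dummy.next


chain = from_sequence("AUGGUAAGCC")
chain = splice_out(chain, 3, 4)  # drop the four elements at positions 3 to 6
print(to_string(chain))
```

The positions are used here only to say which run to remove. The remaining elements are never renumbered or moved, which is the property the comparison with phosphate bonds relies on.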

We summarize in Table I some of 
the built-in methods available to linked 
lists, using the Java language (https://docs.oracle.com/javase/7/docs/api/java/util/ArrayList.html), and include some
examples from cells.

In many cases, the processing could 
be defined in terms of linked lists and/or 
arrays. Let us recall miRNAs and take 
into account the concept of sublists, or 
relative indices, mentioned in Table I. 
In processing step 1, the nucleotides of 
a candidate miRNA could be assigned 
to a sequential linked list. In processing 
step 2, sliding windows 6-10 nucleo- 
tides long (representing candidate seed 
regions) could be fed into a fixed-length 
array. The values in array position[0]
... position[9] would then be systematically
tested against possible acceptor
variables in mRNAs. Multiple hits are
allowed.
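
The two processing steps can be sketched in Python as follows. The sketch makes strong simplifying assumptions: windows are compared directly with stored target patterns (base-pair complementarity is ignored), and the sequences, the site names, and the functions sliding_windows and find_hits are hypothetical.

```python
def sliding_windows(sequence: str, width: int) -> list:
    """Step 2: cut the sequence into every window of the given fixed width."""
    return [sequence[i:i + width] for i in range(len(sequence) - width + 1)]


def find_hits(mirna: str, targets: dict, width: int = 7) -> list:
    """Test each window against each candidate acceptor pattern.

    Returns (window offset, target name) pairs. Multiple hits are allowed.
    """
    hits = []
    for offset, window in enumerate(sliding_windows(mirna, width)):
        for name, site in targets.items():
            if window == site:
                hits.append((offset, name))
    return hits


# Hypothetical data, chosen only to show the mechanics.
candidate = "ACGUAC"
sites = {"site_a": "CGUA", "site_b": "GUAC", "site_c": "UUUU"}
print(find_hits(candidate, sites, width=4))
```

Each window plays the role of the fixed-length array, and its offset within the full sequence is the relative index referred to in Table I.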

Java method | Meaning

Add() | Appends an element to the end or inserts at a specific position.
Cells: RNA transcription; some forms of RNA editing can insert codons (Bass, 2002; Nishikura, 2010); removing introns and splicing exons together; replicating DNA; chromosome cross-over.

Clear() | Removes all of the elements from a list.
Cells: Upon degrading RNA, all resources are free to be used for other purposes, unlike arrays, which when empty still consume computer memory.

Contains() | Returns true if this list contains the specified element.
Get() | Returns the element at the specified position in this list.
IndexOf() | Returns the index of the first occurrence of the specified element.
Cells: To identify an intron, a primary transcript is searched for where the intron starts and ends, which yields the index values. Intron lengths can vary considerably. Automated algorithms, such as SplicePort (http://spliceport.cbcb.umd.edu/) and GeneSplicer (http://ccb.jhu.edu/software/genesplicer/), reflect the logic used in eukaryote cells to identify splice sites.
The same concept is found in DNA, in which transposable elements can be removed from genomes using patterns that define where they begin and end (van de Lagemaat, 2005).
Other examples include: the initiation codon on mRNA is searched for (and modified), and so is the region on mRNA at which to create polyadenylation tails; patterns on mRNA are also searched for where nucleotide posttranscription modifications are to occur. The CRMs (Figure 5) are of variable distance from each other (e.g., after insertion of transposable elements into DNA) and need to be found. The location of elementary symbols for activated CRMs can also be variable, depending on what TFs are bound and which ligands these TFs contain.

Table I. Some built-in methods used with linked lists in object-oriented programming languages like Java and examples from cell biology.



Variables as a Data Structure? 
We mentioned that in programming, 
arrays and linked lists are used to store 
data values. These can be assigned to a 
variable. For example, for an employee 
stored in index position 45 we might 
have a line of programming such as: 
SalaryInDollars = SalaryInPesos(45) * 
1.4, and there is no ambiguity in how 
the value assignment occurs, nor in what 
was assigned to the variable “SalaryIn- 
Dollars.” Sometimes this is also true in 
cells. The anticodon of a specific tRNA
is fixed, and the value of the communicated
charged amino acid is exactly
specified. But in cells this is not always
that straightforward. It would be as if
the variable SalaryInDollars could have
small physical differences that affect
how it interacts with the array positions,
leading to significant effects. This issue
can also apply to variable assignments
that do not involve arrays and linked lists.

Unlike computers, cells often use
variants of variables that do not respond
identically to the same values. For example,
a CRE is like a sensor, a variable
that can be assigned values such as “TF
bound” or “no TF is bound.” However,
the binding sequence of a particular
CRE can vary and therefore respond differently
to an identical TF (which itself
can provide many values). This can have
serious consequences, affecting how fast
and long binding occurs, and could even
affect the subsequent Boolean logic. (For
example, a modified CRE might affect
the geometry of the bound TF and thus
how it interacts with other factors.)
This suggests a novel technical 
inspiration for computer scientists and 
bioinformatic researchers. Instead of 

Java method | Meaning

RemoveAll() | Removes from the list all occurrences of specific values.
Cells: Examples include tRNA splicing (Trotta et al., 1997) and RNA self-splicing (Cech, 2002) based on secondary or tertiary structure. These rely on discrete structures which can be stored as structured (i.e., multisymbol) values in individual linked list positions, which is a different operation than removing whatever is found between two boundary patterns. This assumes a specific code is to work with the linked list.
Note: Gene silencing mechanisms are not the same as physical compacting through physical removal.

RemoveRange() | Removes the elements whose index is between two specified indices.
Cells: After the index locations of intron/exon boundaries are found, the introns can be removed from primary transcripts.

Set() | Replaces the element at positions that need to be specified with a value.
Cells: Error correction mechanisms use a DNA or RNA template; any process which modifies a DNA nucleotide (like methylation) or RNA codon, including RNA editing (Bass, 2002; Nishikura, 2010).

SubList() | Sublist data structures are a feature of linked lists and arrays. Logic processing is performed with respect to the sublist and its own indices, for which the first one is assigned an index value 0, the second 1, etc. All operations performed on the sublist are reflected in the original full list.
Cells: The seed region within miRNAs. In addition, many of the examples above rely on first identifying the location of boundaries; what is relevant thereafter are the relative positions.

Table I (continued)



treating members of a class of CREs
as functionally identical or as separate
variables (as we have been implying so
far), the suggestion here is to develop
a fuzzy-logic type technology which
permits both variables and values to
be processed with variability. Finding
cellular variables would then also
use linked lists, since the candidate
regions and length would be unknown
in advance. The imprecision of many
bioinformatic software tools in identifying
regulatory patterns reflects these joint
uncertainties.
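
One very simple way to picture such graded matching is a similarity score in place of a yes/no test. The Python sketch below is only an assumption about how this might be approached and is not a method proposed in the literature cited here: it scores each window by the fraction of positions that agree with a consensus and keeps those above a chosen threshold. The names match_score and best_sites and the example region are hypothetical.

```python
def match_score(site: str, consensus: str) -> float:
    """Fraction of positions at which the site agrees with the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    matches = sum(a == b for a, b in zip(site, consensus))
    return matches / len(consensus)


def best_sites(region: str, consensus: str, threshold: float) -> list:
    """Scan a region whose candidate positions are not known in advance.

    Returns (offset, score) for every window scoring at least the threshold.
    """
    width = len(consensus)
    scored = []
    for i in range(len(region) - width + 1):
        score = match_score(region[i:i + width], consensus)
        if score >= threshold:
            scored.append((i, score))
    return scored


# Hypothetical region containing a near-consensus site (one mismatch).
print(best_sites("GGTATGATGG", "TATAAT", 0.8))
```

The graded score stands in for the idea that a variant CRE responds to a given value with a different strength, not merely with "bound" or "not bound."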


Here is an example. RNA polymerase
and TFs search for DNA response elements
(sensors) in 100-1000 base-pair
regions upstream from the transcription
start site and on the same strand.
Nucleotide positions are indexed with
negative numbers counting back from
-1 in the 5’ direction. The patterns
to test are variables that are not always
the same in location or details, which
is where linked lists become useful. In
focused initiation, transcription starts at
a single nucleotide or within a narrow
region of several nucleotides having
sequence motifs such as the TATA box
and DPE. In dispersed initiation, there
are multiple weak start sites over a broad
region of about 50 to 100 nucleotides
(Juven-Gershon and Kadonaga, 2010).
This suggestion captures those cases
where symbols seem to have both variable
and value character. The regulatory portion
of a gene defines variables that need
data to know when and where to initiate
transcription, but simultaneously RNA
polymerase and TFs sometimes also
provide variables that need data to know
where to attach in the promoter region.

Figure 11. Structure of RNA. The four nucleotides are defined by whether the base A, C, G, or U is attached where the R group is shown.


II. Generic Programming Elements
Modern computer languages use some 
standard constructs to express what is 
to be done. Often the same logic can 
be reused, and new values only need to 
be assigned to the variables. We will dis- 
cuss the main ones used to implement 
processing in computer and cellular 
programming. 


Assign Values to Variables 
Kirschner and Gerhart noted that in- 
formation is used by cells to respond 
to changing circumstances. They wrote, 
“Two extreme views of information 
transfer have always existed in biology, 
the permissive and the instructive. The 
distinction comes up whenever there is 
a stimulus and response, or more generally
a cause and an effect ... Watering
a seed provides a stimulus, but it is a 
permissive input, since no one would 
assume that the water falling on the 
seed instructs the seed how to germi- 
nate into a plant” (2005, p. 125). We 
believe their intuition refers to values 
(provided by the stimulus) and variables 
(which generate a response upon pro- 
cessing with the assigned value). The 
cascade of steps to be executed (after,
for example, sensing moisture) must
already have been prepared and an- 
ticipated at the receiving end. The 
variables patiently wait until activated 
by informative signals. 

Programs and subroutines use vari- 
ables restricted, as we mentioned, to a 
relevant data type, to which different 


values can be assigned every time the
program is executed. To illustrate, price,
discnt, p, d, and newpri are variables in
this Fortran-like programming code.

   price = 100                        (5)
   discnt = 5
   call calc1(price, discnt)

   subroutine calc1(p, d)
   newpri = p - d



Values have been assigned or are cal- 
culated. Here price and p have the same 
meaning, and two coding conventions 
are linked by associating a variable from 
the calling program to one used within 
the receiving subroutine calc1.
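
The same binding of caller variables to receiver variables can be shown as a runnable Python sketch. The variable names follow the fragment above. Returning newpri to the caller is an assumption added here, since the fragment does not show how the result is passed back.

```python
def calc1(p, d):
    """Receiver side: p and d take on the values supplied by the caller."""
    newpri = p - d
    return newpri


# Caller side: values are assigned to the calling program's variables.
price = 100
discnt = 5

# The call links price to p and discnt to d.
print(calc1(price, discnt))
```

The subroutine's logic is written once and can be reused with any pair of values of the appropriate data type.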

How do variables relate to the dis- 
cussion on symbols, data types, subsets, 
and operations above? In computer pro- 
grams, variable names and their values 
are constructed from one or more fun- 
damental symbols. The variable price is 
defined by combining several symbols 
from the relevant ASCII alphabet and 
is treated as a unique entity. The symbol 
combination ecirp, however, has not 
been assigned a meaning in (5) and is 
not a valid variable in this program. The 
value 100 assigned to price is also com- 
prised of several ASCII symbols, which 
taken together have a unique meaning, 
but assigning price = e34/$![ makes no 
sense, being outside the relevant data 
type. An operation newpri = TRUE / 45
is also not legitimate, not being a valid
operation for that data type.

Through such precise software
conventions, programming errors can
be avoided and the actions to perform
expressed unambiguously. However, if
the semantic meaning of the variables is 
not known, the ultimate intention and 
results might never be fully understood. 
What if the source code is not available 
at all but only the executable program? 
By empirically testing variable values, 
the hidden Boolean logic can still be 
discerned in principle by the results, an 
important observation when reflecting 
on cells. 
How does all this work with cells?
To understand cellular logic, one must 
identify four players: the sending coding 
system; the receiving coding system; 
and, for both, what the variables are 
and what provides their values. What 
is a variable? It is the biological recep- 
tor or sensor able to assume alternative 
values (including a simple “bound”/“not 
bound” state), which, once activated 
with a value, leads to a relevant biologi- 
cal response. 

Variables are composed of a single 
symbol or of elementary symbols com- 
bined in a unique manner (in computers 
and cells). Defining variables is neces- 
sary to program intention, and cellular 
variables are identifiable by humans and 
cellular decoders. 

The A site of a ribosome is a receiver 
variable (Figure 2), able to accept as val- 
ues any of the 64 codons or to be empty. 
To work properly at the ribosome, not 
any codon will do. It must only accept 
the value transferred by a specific sender 
variable, which is associated with the 
relevant mRNA. 

As another example, the location 
on a template DNA being currently 
processed by a DNA or RNA polymerase
is a sender variable whose current value 
is one of the four nucleotides to be com- 
municated to a polymerase decoder. 
At the end of the growing chain, part 
of the polymerase defines a receiver 
variable, which needs to know which 
nucleotide is to be added (the receiver 
value) (Figure 3). 

The general pattern should now 
be clear. Special locations on sugars, 
membranes, or proteins are variables 
that can accept values (ligands or noth- 
ing bound), for example, in the histone 
code. The enzymes that methylate the 
appropriate histone residue can have 
many variables of their own—used to 
first perform their own internal logic— 
and then a sender variable is assigned a 
sending value, the ligand it will transfer. 
Recall that a chain of sender/receivers 
can be set up. 


The discussion above may have 
suggested that only a few elementary 
symbols are used along with a handful 
of values for variables. Unlike comput- 
ers, which use only elementary 0/1 “bits” 
grouped into a relatively modest number 
of unique ASCII symbols, in cells vari- 
ables and their values rely on different 
and more complex alphabets for dif- 
ferent codes, using many elementary 
symbols having distinct geometric and 
electronic properties. 

With computers, hardware design 
is simpler and more reliable if the va- 
riety of elementary symbols (bits) and 
grouped symbols like ASCII letters are 
restricted. Many of the cellular codes, 
however, must support a far more nu- 
anced behavior (recall our comments on 
fuzzy variables and fuzzy values). A very 
large number of elementary symbols are 
used, each having three-dimensional 
electronic and geometric features (as
when portions of amino acids within proteins
are combined in TFs). This permits
rheostat-like or fuzzy-logic outcomes,
which can be fine-tuned dynamically.

To illustrate, not only can different 
combinations of amino acids define the 
same kind of TF, but nearby attachments 
and physical conditions like temperature 
and salinity can affect the quantitative 
value that gets interpreted once bound to 
a CRE. Fine differences in the topology 
of the same kind of CRE —even those 
having identical nucleotides—can also 
lead to quantitative differences upon 
interacting with a seemingly identical 
TF. This is important to understand how 
codes can interact synergistically. They 
can modify the physical geometry of the 
compound symbols used by other codes. 


Assign a Value to a Constant 
Values of variables could change very 
often during execution of a program, 
such as the next nucleotide value to be 
processed by a polymerase. Programs 
also benefit from using constants, which 
during a relevant time period should not 
change. Implicit in cellular logic processing
are many constants, such as the
temperature, amount of energy provided 
by an ATP molecule, which ensemble 
of genes are up- and down-regulated for 
a cell type, and genomic imprinting (in 
which certain genes are expressed in a 
parent-of-origin-specific manner). 


Boolean Logic 

The ability to use IF-THEN-ELSE type 
logic adds immense value to programming,
and to understand cellular logic,
one must identify which variable is
being tested and what provides its values.
Between 5% and 10% of protein-coding
genes in most organisms encode a TF
(values for CREs), and these can have
multiple binding domains. Only three
kinds of domain are known: cold shock,
helix-turn-helix (HTH) type 3, and HTH
psq (Walhout et al., 2013, p. 67). Interaction
of only portions of a domain with a
CRE or other biochemicals defines the
values (Figure 12).

Example (6) illustrates in program- 
ming terms the kinds of logic performed. 





   if (CRE_1 = 'val_1') {do this}                      (6)
   else if (CRE_1 = 'val_2') {do something else}
   else if (CRE_1 = 'val_3') {do the following}
   else {do nothing, or continue what you are
         doing... whatever makes sense}
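
For readers who prefer something executable, the same branching can be sketched in Python. The function name respond and the returned strings are placeholders assumed for illustration. They stand in for whatever response a cell would actually carry out.

```python
def respond(cre_1: str) -> str:
    """Choose a response from the value currently assigned to the variable CRE_1."""
    if cre_1 == "val_1":
        return "do this"
    elif cre_1 == "val_2":
        return "do something else"
    elif cre_1 == "val_3":
        return "do the following"
    else:
        # Any other value, including no TF bound, falls through to the default.
        return "do nothing"


print(respond("val_2"))
print(respond("unbound"))
```

The final branch matters as much as the others, because a receiver must have a defined behavior for values it does not recognize.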


Checkpoint if-then logic occurs
throughout every step of the cell cycle
(Shapiro, 2014), checking for genome
damage (Ishikawa et al., 2006), nutritional
status (Searle et al., 2011), progress
of replication (Segurado and Tercero,
2009), DNA replication (Putnam et al.,
2009; Nguyen et al., 2010), DNA damage
(Huen and Chen, 2010), chromosome
alignment on the spindle pole (Nezi and
Musacchio, 2009; Musacchio, 2011),
spindle orientation (Caydasi et al., 2010),
telomere capping (Ciapponi and Cenci,

Figure 12. Transcription factors possess DNA-binding domains (solid black), only portions of which provide the values for receiver variables (the appropriate cis-regulatory elements).


2008), cell size (Fang et al., 2006), and 
whether the cell has accumulated the 
necessary components needed by the 
daughter cells (Sabelli et al., 2013). 

Errors would lead to serious consequences.
Instead of genome repair in
response to DNA damage, the if-then
logic could lead to programmed cell
death (apoptosis) (Tentner et al., 2012;
Walsh and Edinger, 2010; Engelberg-Kulka
et al., 2009), using some intercell
molecules as “death factors” (Holoch
and Griffith, 2009), or to a decision to
halt the cycle and initiate very sophisticated
repairs (Song, 2007).


Iteration 
Iteration loops are often used in programming
to ensure the correct number
of repetitions. An “infinite loop” would
consume a computer’s (and a cell’s) resources
and must be prevented (Figure
13).

Various repetitive processes occur 
in cells under the careful regulation of 
Boolean decisions: many RNA copies 
are produced from a single gene; many 
protein copies are made from a single 
mRNA; many copies of key biochemi- 
cals are synthesized, such as amino acids, 
tRNAs, hormones, ATP, antibodies, etc.; 
each codon position on mRNAs must be 
processed; flagella must rotate enough 
times but not continually; tubulin copies 
are polymerized to form long microtu- 
bules; enough recursive interactions 
having the right parameters must be run 
to produce steady-state genetic regulatory
circuits; and many copies of each
cell type are produced in eukaryotes. 
There are many more examples, 
recognized whenever a cyclic behavior 
is observed having feedback control. 
Examination of molecular machines 
reveals that this is a general principle. 
Controlling iteration, defining the conditions
to use, when to start, and when
to terminate, must be implemented
simultaneously with the iterating processes.
Runaway production would be deadly.
Remarkably, this applies not only to the
operation of molecular machines but
also to the process that creates the right
number of them, according to current
cellular need. Structuring data into
data structures like arrays and linked
lists facilitates
the use of iterations in programming.
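
A controlled loop of this kind can be sketched in Python. The function produce_copies, the target count, and the safety limit max_iterations are assumptions for illustration. The loop ends either when the current need is met or when the guard against runaway repetition is reached.

```python
def produce_copies(target: int, max_iterations: int = 1000) -> int:
    """Make copies until the target is met, with a guard against endless looping."""
    copies = 0
    iterations = 0
    # Both conditions are tested before every repetition.
    while copies < target and iterations < max_iterations:
        copies += 1       # one more copy is produced
        iterations += 1   # the guard counter advances with each pass
    return copies


print(produce_copies(5))
print(produce_copies(10, max_iterations=3))  # the guard stops production early
```

The termination test is part of the loop itself and is not added afterward, which mirrors the point that control must be implemented together with the iterating process.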


Control Structures 

Programs use techniques to control 
what is to be done, when, where, how, 
and how often. In cells, we find many 
examples. We discussed iteration already. 
Boolean logic is used with the binding 
state of cis-regulatory elements (CREs)
such as enhancers, silencers, and insula- 
tors (Kolovos et al., 2012; Capelson and 
Corces, 2004) to regulate genes precisely, 
in a manner unique to each cell lineage 
(Davidson, 2006). The logic is often 
very complex. Suites of cis-regulatory 
modules (CRMs) (Figure 5) can regu- 
late multiple genetic loci distributed 
throughout the genome, establishing 
network circuits sometimes called 
“regulons” or “cis-regulatory networks” 
(Dufour et al., 2010). 
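
As a rough illustration of Boolean logic applied to the binding state of regulatory elements, the sketch below encodes one invented rule for a single hypothetical gene. The rule itself (a bound enhancer that is not blocked by an insulator, and no bound silencer) is an assumption made for the example; real cis-regulatory logic is, as noted above, often far more complex.

```python
def gene_is_expressed(enhancer_bound, silencer_bound, insulator_blocks_enhancer):
    """Toy Boolean rule for one hypothetical gene.

    Assumed rule (illustrative only): the gene is expressed when its
    enhancer is bound and not blocked by an insulator, and no silencer
    is bound.
    """
    enhancer_active = enhancer_bound and not insulator_blocks_enhancer
    return enhancer_active and not silencer_bound


# Enumerate all eight binding states to show the full truth table.
for enhancer in (False, True):
    for silencer in (False, True):
        for insulator in (False, True):
            state = gene_is_expressed(enhancer, silencer, insulator)
            print(enhancer, silencer, insulator, "->", state)
```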

The combinatorial potential through 
binding various TFs permits a vast range
of regulatory possibilities, able to engage 
in sophisticated molecular computa- 
tions (Shapiro and Sternberg, 2005; 
Davidson and Erwin, 2006). Because 
the underlying physical interactions 
are weak, the components can form 
and dissociate rapidly to permit quick 
responses to signals received. Complex 
computations using weak interactions
to form novel circuits are also typical of
how neurons are wired (London and
Hausser, 2005; Sidiropoulou et al., 2006;
Markram et al., 2015).

Figure 13. Iteration loops are common in computer and cellular programming.
Conditions are tested to determine when to initiate an iterative process and when
to repeat or terminate it.

Computer programmers can use 
“GoTo” type commands. Special signals 
are ubiquitous in cells, which specify 
where molecular machines and com- 
ponents are to act, i.e., which organelle, 
subcompartment, or location on a 
membrane. Causing instructions that 
are stored elsewhere to be executed 
goes by names such as functions, meth- 
ods, procedures, and subroutines in 
computer programming. In cells, there 
are many examples, such as activating 
hox genes to regulate expression of 
many genes as a modular ensemble and 
activating key TFs to generate genetic
regulatory circuits (Davidson,
2006). Remote processing is often en- 
capsulated in various subcompartments 
and organelles. We recall that DNA is 
also present in plasmids, mitochondria, 
and chloroplasts, not just chromosomes. 
These decisions also require the use of 
variables. 

Another technique used by computer 
languages is the idea of “sleep” or “wait” 
for a fixed or variable time period. We 
find many examples in cells, such as 


feedback inhibition in enzymatic net- 
works, gene deactivation, and placing 
the cell cycle on hold. 


Other Non-Prescriptive Processing 
Most of what happens in computers 
results from explicit instructions, but 
our analysis of coded information sys- 
tems clarifies that additional physical 
constraints are also always incorporated 
to ensure the intended outcomes. There 
are design trade-offs, whether to guide 
intention as coded messages or in a hard-wired
physical manner. A computer example
is when printed paper falls into a
tray with sides that hold it in place. A
considerable amount of cellular success 
is based on pure physical-chemical fac- 
tors that have been carefully organized, 
a topic we discuss in Part 2. 


Computer programs read, write, and de- 
lete to long-term and short-term memory 
devices. The codes found in cells must 
be able to read and write data values. 
Setting epigenetic tags is an example of
medium- and long-term write operations,
which serve to communicate intended 
outcomes later. DNA is usually thought 


of as a fairly permanent source to read 
data from, but DNA can be added to 
a genome via CRISPR (Zetsche et al., 
2015; Ran et al., 2015; Gen News Highlights,
2015), reverse transcription (e.g.,
telomerase reverse transcriptase that 
maintains the telomeres of eukaryotic 
chromosomes), transfer and acquisi- 
tion of new genes via integrons coding 
cassettes (Hall and Collis, 1995), and 
different lateral gene-transfer mecha- 
nisms, including transfer of plasmids, in 
prokaryotes. Inteins are another mecha- 
nism. These are self-splicing portions of 
proteins with homing endonuclease abil- 
ity that snip parts of DNA so that a copy 
of the coding sequence of the intein can 
be inserted there (Gogarten et al., 2002). 

DNA can also be modified in other 
ways. DNA segments such as transpo- 
sons can be transferred to other sites on 
the genome, and “shufflons” can invert 
sections of DNA, for example, to replace 
part of a coding strand with its complementary
strand to create modified proteins
(Tam et al., 2005; Komano, 1999).


Multiprocessing and Threading 
Modern computer hardware and soft- 
ware designs can parallelize computa- 
tions, permitting multiple tasks to be 
carried out simultaneously. This is 
common in cells, such as in the parallel 
production of ATP from many mitochondria;
translation of several identical
mRNAs in parallel (several ribosomes
can also translate the same mRNA
simultaneously); transcription of multiple
copies of the same gene; the existence
of many cells of the same kind; and the
presence of multiple copies of the same
subcompartments and organelles.


Reuse of Modules 
In good software design, the same 
general-purpose modules, methods, and 
procedures are often reused. A common
approach is to separate identical portions 
of coding into smaller modules that 
can be invoked from within overarching
modules. This modularity is found
also in cells. As Kirschner and Gerhart
pointed out (2005, p. 137), “The same
pathways are used over and over again 
within the same organism for different 
purposes. Thus, they must be modified 
slightly to interact with a variety of pro- 
cesses and to work in different environ- 
ments and cell types.” They describe the 
interactions as “weak linkages,” which 
we recognize as simply variables or 
parameters used to link subprocesses in 
different manners. 


Interchangeable Libraries 
In addition to invoking subroutines, sec- 
tions of computer code such as classes 
are often imported from a library. Simi- 
larly, prokaryotes in particular exchange 
genetic material through horizontal 
(lateral) gene transfer (Thomas and 
Nielsen, 2005; Ochman et al., 2000; 
Koonin et al., 2001), whereby genes,
plasmids, and so-called “islands” encoding
specialized adaptive functions
are exchanged (Dobrindt et al., 2004). 
This permits a huge amount of coding 
to be distributed in the environment and 
put to use rapidly when the need arises, 
facilitating adaptability. This is a form of 
open systems design. Genetic material 
can also be transferred into eukaryotes 
through vectors such as viruses. 


III. File Formatting
Shapiro and Sternberg (2005) drew 
attention to the parallels between com- 
puter file formatting and data storage 
in cells: 
The explicit parallel with electronic 
data systems indicates that the ge- 
nomic storage medium has to be 
marked, or formatted, with generic 
signals so that operational hardware 
can locate and process the stored 
information. (Shapiro and Sternberg, 
2005) 

Data storage can be organized physi- 
cally in computer and cell technologies 
using principles such as sectors, disk par- 
titioning, and data segments, discussed 
in Part 2. On top of this infrastructure, 


software programs organize different 
data using file formatting. A program 
that interacts with specially structured 
file data must be able to access it cor- 
rectly, even though the location of the 
content could be scattered all over the 
physical medium. DNA, RNA, and 
proteins are used as read/write/delete 
storage devices and need to be properly 
formatted so the corresponding “reader” 
will work. 

The metadata contained in a com- 
puter file header can be stored at the 
start, end, or other areas of the file. 
Likewise, in DNA, RNA, and proteins 
formatting instructions need not be 
found in only one location. Given the 
existence of multiple codes, DNA “files” 
are formatted for use in different man- 
ners, depending on the program being 
used. The various ways DNA is packed,
such as by nucleosomes, determine 
which genes can be processed. Preparing 
portions of DNA for processing by DNA 
polymerase (to identify the starting and 
end points, open and unwind the strands, 
remove bound histones, etc.) is very 
different from the formatting details — 
which occur in three dimensions—for 
RNA polymerase. The programs that 
perform DNA error corrections also 
require their own formatting rules. 

Epigenetic tags are often used to 
identify what data to process and how. 
Adding and removing these ligands from 
DNA, RNA, and proteins is an example 
of preparing files for processing and 
must be carefully regulated. Histone 
modifications define which portions of 
DNA can be processed. Over a hundred 
posttranscription modifications have 
been identified in all three major RNA 
species (tRNA, mRNA and rRNA), as 
well as in other families of RNA such as 
snRNA (Cantara et al., 2011). Examples
of formatting specifications in DNA 
include the use of methylation and de- 
methylation (Bird, 2002; Paszkowski and 
Whitham, 2001), binding of TFs (Cheng 
et al., 2012; Davidson, 2006), and rules
to identify exons (Harrow et al., 2009). 


Individual eukaryote mRNAs are 
formatted as individual files with begin- 
ning and ending metadata in the form 
of 5’ capping and 3’ polyadenylation, 
attached miRNAs, and so on. This is 
necessary to ensure the ribosome pro- 
gram will work properly. Different sets 
of formatting rules are necessary for 
different programs such as separation of 
introns and exons by the spliceosome or 
to degrade RNA. 

Formatting on proteins is common. 
Posttranslation modifications (PTM)
include methylation, phosphorylation, 
acetylation, ubiquitylation, glycosyl- 
ation, and sumoylation (Strahl and Allis, 
2000; Jenuwein and Allis, 2001). Struc- 
tural three-dimensional recognition 
features, generated with alpha coils, beta 
sheets, disulfide bonds, hydrophobic 
patches, and other features also ensure 
correct formatting of proteins. In cells, 
all this is precisely regulated, often 
down to the atomic level. Reversible 
phosphorylation, the most widespread 
PTM, occurs on the correct atom of a 
serine, threonine, or tyrosine residue 
to form phosphomonoesters or on his- 
tidine, arginine, and lysine residues to 
form phosphoramidates (Cieśla et al.,
2011), all according to the particular 
code involved. Recalling the existence 
of signal cascades and enzymatic net- 
works, proteins are also carriers of data 
values that get processed by other sensors 
(variables to be assigned values). DNA 
and RNA are not the only information 
carriers in cells. A modification on a TF
can become a data setting to be used by
the receiving portion of a second TF. For
these reasons we see that proteins can be 
formatted and classified into different 
“file types.” 

Copies of tagged proteins, RNA, 
and DNA (like nucleosomes) can be 
inherited by somatic daughter cells, and 
sometimes the tag is removed from the 
daughter cell, generating an empty or 
partially empty “file” that can be written 
to. In the same way that a program like 
Excel cannot process a jpg file, each
of the cellular information “readers”
process only the data specially format- 
ted for it. 


Compressed Archival 

DNA is compressed and protected for 
future use by winding sections of ~ 147 
base pairs around a core of 8 positively 
charged histone proteins into nucleo- 
somes, and then further compacting 
the nucleosomes into higher order 
chromatin structures complexed with 
protein and RNA (Jenuwein and Allis, 
2001). The portions of DNA that need 
to be expressed must be unpacked and 
reformatted properly. The cellular goal 
is to save physical space and protect the 
medium from degradation. Bacteria 
also quickly lose DNA not immediately 
needed (and can regain genes via lateral 
gene transfer), which saves precious raw 
materials and energy. Computer analo- 
gies of the principle include programs 
like zip and the export of tables by a 
database management system into a single
export file, all or parts of which can be 
retrieved and properly structured for 
use later. However, computers use algo- 
rithms that recode the original content 
using fewer bits, a principle not known 
in cells. Inspired by cellular compres- 
sion, which transforms essentially linear 
DNA into three-dimensional storage, 
engineers might consider designing 
mass storage devices to also store data 
that cannot be used immediately as 
is but, like packed DNA, could be 
reopened when needed. 


Summary and Discussion 
Recognizing cells as information pro- 
cessing devices is the proper way to un- 
derstand their holistic intent and design. 
In fact, Gatlin (1972, p. 1) defined life as
an “information processing system,” and 
Britten (Britten, 2003, p. 82) pointed out, 
“We cannot start with DNA and grow a 
cell because there must be an adequate 
initial state of a cell with a vast multitude
of details under control.” We mentioned 


above that cellular information is 
partially distributed hierarchically and 
recognize that there are many carriers 
in the lower, embedded levels. An organ 
consists of many cells, each of which 
contains many mitochondria, and so 
on. In large populations of prokaryotes, 
the logic processing is distributed over
many interacting species to form a viable 
ecology, whereas in complex eukaryotes 
considerably more is concentrated 
within the individual organism. In virtu- 
ally all biochemical processes, one sees 
strong regulation unless the process is 
malfunctioning, as in cancerous growth 
or viral infection. In other words, there 
are always sophisticated rules for when 
to begin and countermeasures that pre- 
vent runaway processing. 

Regulation is best designed and 
interpreted using purely formal rules, a 
key feature of software engineering. If, 
for example, a metabolic chain requires 
feedback control to a preceding enzy- 
matic reaction, this can be analyzed and 
expressed symbolically, along with the 
mathematical specifications and control 
rules. To instantiate the requirements, a
physically viable solution then needs to 
be implemented. No rational engineer 
or programmer would think of develop- 
ing programs by letting rules and their 
implementation pop into existence ran- 
domly without any conceptual guidance. 
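
To show what expressing feedback control symbolically and with mathematical specifications might look like, the sketch below simulates a single enzymatic step whose rate is reduced by its own end product. The rate law `max_rate / (1 + product / threshold)`, the first-order consumption term, and all parameter values are assumptions chosen for this example, not measured kinetics.

```python
def simulate_feedback(steps, max_rate=10.0, threshold=50.0,
                      consumption=0.1, inhibited=True):
    """Discrete-time toy model of end-product feedback inhibition.

    Assumed control rule (illustrative only):
        rate = max_rate / (1 + product / threshold)   if inhibited
        rate = max_rate                                otherwise
    Each step the product gains `rate` and loses `consumption * product`.
    """
    product = 0.0
    history = []
    for _ in range(steps):
        if inhibited:
            rate = max_rate / (1.0 + product / threshold)
        else:
            rate = max_rate
        product += rate - consumption * product
        history.append(product)
    return history


with_feedback = simulate_feedback(200)
without_feedback = simulate_feedback(200, inhibited=False)
print(round(with_feedback[-1], 3), round(without_feedback[-1], 3))
```

With these assumed parameters the regulated system settles where production equals consumption at a product level of 50, half the level of 100 reached by the same chain without the feedback term.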

We saw that conceptual software 
elements such as iteration and control 
structures are developed on top of 
data types—each with their unique 
properties — organized into variables, 
arrays, and linked lists and all this using 
well-defined file formatting to facilitate 
processing by molecular machines. 
Many independent codes found in cells 
make use of these principles. It is hard 
to overstate how important variables are 
in cellular processes to permit regulation 
and maximum adaptability. The location,
timing, and amount of transcription
by RNA polymerase are defined by
CREs (promoters, enhancers, silencers, 
insulators; Kolovos et al., 2012) and 


termination by terminator sequences 
(Ishihama, 2000). 

It would require many volumes to 
describe in detail the formal control 
structures used by other cellular activi- 
ties, such as homologous chromosome 
crossing-over, VDJ recombination in 
the immune system, nonhomologous 
end-joining (NHEJ) of broken DNA 
ends, DNA transposons (self-insertion, 
excision), telomerase extension, chro- 
mosome segregation, DNA compaction, 
binding sites affecting DNA spatial or- 
ganization into transcription factories in 
the nucleus, signals for error correction 
and damage repair, and the multitude 
of other cellular processes. 

There is considerable evidence 
that damage through random changes 
is actively hindered in cells, such as a 
bias for many retrotransposons to insert 
upstream of transcription initiation sites 
(Shapiro and Sternberg, 2005), which 
prevents damage to coding sequences 
and enhances the potential for a con- 
structive regulatory change. Very often 
the regulatory logic makes sense to 
humans skilled in symbolic logic, but the
details are different across taxa and did 
not originate from a common ancestor. 
An example is the signal used in E. coli 
to repress catabolism (the CRP palin- 
dromic binding site for the CRP-cAMP 
complex), which is unrelated to that 
found in Bacillus subtilis (CRE element 
recognized by protein CcpA) (Miwa et 
al., 2000). 


Coded Systems Can Interact 
Although the various codes operate 
independently in cells, they can col- 
laborate to ensure a fine-tuned outcome. 
We mentioned epigenetic codes, which 
modify gene expressions, and another 
code based on TFs bound to CREs, 
which also regulate gene expression. 
But in addition, a different code based 
on adding and removing ligands (especially
phosphate groups) modifies the
TFs themselves (Shapiro, 2006). Furthermore,
TF half-lives are also regulated
by the N-end code. Gene expression is
further affected by other codes which 
use various classes of RNAs (siRNA, 
snoRNA, miRNA, etc.) that modify 
chromatin accessibility, transcription 
initiation, transcription elongation, RNA 
processing, RNA stability, and mRNA 
translation (Mattick and Makunin, 2006; 
Taft et al., 2010; Storz and Wassarman, 
2005). 

By integrating multiple codes, cells 
become highly responsive to what is 
going on throughout the entire cell and 
their external environment. The design 
requirements would be overwhelming 
for humans. The same stretch of DNA 
can be used as variables for some codes 
(e.g., CREs, methylation binding loca- 
tions, and after transcription to locate 
regions on mRNA for miRNA binding 
and to specify intron/exon boundary 
locations) while simultaneously provid- 
ing the data values for other codes (e.g., 
as codons after transcription and as a 
template for new DNA copies). 

These requirements demand formal 
specifications to satisfy all requirements 
and to define what is to be done by each 
code. Synonymous coding from the 
point of view of the genetic code must 
identify protein sequences while simul- 
taneously controlling translation rates 
within regions of mRNA. The DNA- 
to-RNA conversion code during tran- 
scription also needs to control stalling 
of mRNA precursors for spliceosomes 
for purposes of siRNA accumulation 
as part of a host’s defenses to damaging 
transposons (Dumesic et al., 2013). 

Collaboration between coding sys- 
tems is sometimes linked directly. The 
histone modifications, which involve 
over 100 protein readers, writers, and 
erasers (Carey, 2012, pp. 72, 224), sometimes
develop protein complexes that
include the enzymes that methylate 
CpG motifs on DNA (DNMT3A and 
DNMT3B) in the same region the 
histone is located (Carey, 2012, pp. 
73, 89-90). This is another example of 
instantiation using adaptor molecules. 


The reverse is also true. The DNA 
methylation code can affect the histone 
code in a synergistic manner. Meth- 
ylation attracts more repressive histone 
modifying enzymes (Carey, 2012, pp. 
224-226). Similarly, long ncRNAs locate 
near imprinted genes (which identify 
whether coming from the mother or 
father), and these can recruit epigenetic 
enzymes such as G9a or EZH2, which 
put a methyl tag on lysines K9 and K27 of
histone H3 (a second code) to enhance 
the imprinting (Ikegami et al., 2009). To 
complicate the picture, long ncRNAs 
can increase or decrease expression of 
target genes for reasons not understood. 

The miRNA code also interacts with 
enzymes involved in epigenetic codes by 
regulating their effective concentration 
(Carey, 2012, pp. 231-232). 

Stem cells express a very different 
set of proteins than differentiated lin- 
eages. Not only are different genes de- 
activated by blocking TFs bound in the 
cis-element region, but also a different 
set of miRNAs are switched on (a second 
code) to help identify and degrade the 
mRNAs no longer needed by that class 
of cells (Pauli et al., 2011). 

Chemotaxis (ability to swim toward 
nutrients and away from noxious stimuli) 
uses two codes in E. coli to respond to 
more than fifty substances. In the first 
one, there are four kinds of receptors 
on the membrane that respond to the 
environment by phosphorylating the 
communication protein CheY, which 
can modify the direction of rotation of 
the flagellar motor through binding at 
certain locations. A second code affects 
the four kinds of receptors themselves by 
adding and removing methyl groups to 
any of eight different sites per receptor. 
The receptors are grouped into triplets 
on the membrane, so the number of 
possible methylation states is astronomi- 
cally large. The net outcome of these 
two coded processes is to permit the 
bacteria to “in effect perform calculus” 
(Bray, 2009, p. 94). It is not the absolute 
concentration of external stimulant that 


determines the decision to change direc- 
tion of movement, but rather a large 
change in the relative concentration 
(Bray, 2009, pp. 89-97). 
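
The distinction between absolute and relative change can be made concrete with a small sketch. The function name, the 10% threshold, and the concentration values are hypothetical; the code illustrates only the decision rule described above and is not Bray's model of the chemotaxis pathway.

```python
def should_keep_direction(previous_concentration, current_concentration,
                          relative_threshold=0.1):
    """Return True when an attractant rose by more than a relative threshold.

    The decision depends on the fractional change, not on the absolute
    concentration. The threshold value is an assumption for illustration.
    """
    if previous_concentration <= 0:
        raise ValueError("previous_concentration must be positive")
    relative_change = (
        (current_concentration - previous_concentration) / previous_concentration
    )
    return relative_change > relative_threshold


# The same absolute increase (0.5 units) gives different decisions.
print(should_keep_direction(1.0, 1.5))      # 50% relative increase
print(should_keep_direction(100.0, 100.5))  # 0.5% relative increase
```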

In Part 2, we flesh out our understanding
of cells as holistic entities
whose hardware components must also 
be taken into account in addition to the 
interacting codes. It is wrong to think 
DNA provides detailed instructions on 
how to assemble an organism. Oyama 
(2002) pointed out that “a gene initiates 
a sequence of events only if one chooses 
to begin analysis at that point: it occupies 
no privileged energetic position outside 
the flux of physical interactions” (p. 40) 
and that “gene transcription and trans- 
lation in no way represent instructions 
for building a functioning body” (p. 
69). She correctly mentioned that the 
interactions needed to define organisms 
are inherited as already functioning cells 
and in a similar environmental context 


as the parent (pp. 17-18, 26, 43-49, 77). 


Dynamic Nature of Cellular Control 

The location of data in computer 
memory is rearranged in controlled 
manners and address pointers are used 
to identify the location of data. For 
cells this is also true, but the process is 
more sophisticated. A TF can search 
for a CRE in three-dimensional space 
and is robust to physical degradation of 
its target through mutations. Unlike a 
computer pointer to a single address, in 
cells n identical TFs or other signals can 
point to multiple locations to activate 
an ensemble of process-related genes. 
In computers, a memory address is usually
referenced directly, whereas in cells
a linked chain of pointers referencing
other pointers often leads to the sites to be
activated, which permits refinements,
including fuzzy logic (Kosko and Isaka, 
1993; http://zadeh.cs.berkeley.edu) to 
be integrated at every step. 
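
The idea of one signal reaching several targets through a chain of pointers that reference other pointers can be sketched as follows. The names in `links` (`signal`, `TF_A`, `gene_1`, and so on) are invented placeholders, and the dictionary is only a convenient stand-in for the physical search process described above.

```python
def resolve_targets(start, links):
    """Follow chained references from `start` until terminal targets are reached.

    `links` maps a name to the list of names it points to. A name that
    does not appear as a key in `links` is treated as a terminal target.
    """
    targets = []
    seen = set()
    stack = [start]
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # avoid following the same reference twice
        seen.add(name)
        next_names = links.get(name)
        if next_names is None:
            targets.append(name)
        else:
            stack.extend(next_names)
    return sorted(targets)


# Hypothetical chain: a signal activates TF_A, which activates TF_B and gene_1.
links = {
    "signal": ["TF_A"],
    "TF_A": ["TF_B", "gene_1"],
    "TF_B": ["gene_2", "gene_3"],
}
print(resolve_targets("signal", links))
```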


Analog Computers 
We have not mentioned principles from 
the less-known analog computers in this
introduction to logic processing in cells.
We only wish to point out here that the 
wide diversity in sensors responding 
to signals can produce a rheostat-like 
response (i.e., a continuum of response). 
Software designed for digital computers 
would process this kind of logic by defining
ranges of values for these variables
and programming the appropriate behavior
for each range. This relates to our suggestion
above that computer scientists
consider using fuzzy variables and fuzzy
values, a principle that cells use.
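
The two approaches mentioned here, splitting a continuous signal into discrete ranges and assigning a graded (fuzzy) value, can be contrasted in a short sketch. The cut-off values 0.2 and 0.7 and the labels are arbitrary assumptions used only to make the example concrete.

```python
def classify_signal(level):
    """Digital-style handling: map a continuous level to one of three ranges."""
    if level < 0.2:
        return "off"
    if level < 0.7:
        return "partial"
    return "full"


def fuzzy_membership_high(level, low=0.2, high=0.7):
    """Fuzzy-style handling: degree (0.0 to 1.0) to which the level is 'high'.

    Below `low` the degree is 0, above `high` it is 1, and in between it
    rises linearly, giving a rheostat-like response.
    """
    if level <= low:
        return 0.0
    if level >= high:
        return 1.0
    return (level - low) / (high - low)


for level in (0.1, 0.45, 0.9):
    print(level, classify_signal(level), round(fuzzy_membership_high(level), 2))
```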


Neo-Darwinism Fails to Explain the Origin of Logic Processing
In The Plausibility of Life, we read, “The 
architecture of cells is achieved without 
an architect. No central regulation is 
discernible. Cells are in fact capable of 
many structures; many are chameleons 
that change their structure in response 
to circumstances” (Kirschner and Gerhart,
2005, p. 148). It is correct that
there is no set of instructions on DNA
that specifies the detailed order in which
events are to unfold, but this does not 
deny an architect; in fact, it indicates 
a creator who designed for adaptability 
to changing circumstances (Truman, 
2015). As mentioned above, a virtually 
unlimited variety of responses can be 
executed by using enough variables 
and their values. Adaptability is found 
everywhere in biology, not only within 
cells. Gilbert (2003) provides several 
examples of dramatic polyphenism, or 
open systems adaptability, such as sex 
determination of blue-headed wrasse 
larvae depending upon the presence
of other males or females nearby; diet 
in caterpillars, which enables them to 
change their morphology to camouflage 
themselves according to season when 
born; and predator-secreted chemicals. 

Cellular processes must be initiated
and stopped. Runaway execution would 
rarely if ever be acceptable, but why 
should the termination rules develop 
in advance of these thousands of formal 
logic-guided processes? Which evolved
first, the process or the means to turn it
off? Natural processes cannot look ahead
to plan complex solutions to make cells
and entire organisms adaptable. Gene
regulatory networks, signal cascades,
metabolic networks (Figure 14), and
the operation of molecular machines are
regulated at many levels using programming
constructs recognizable by human
designers.

Figure 14. Enzyme chain including feedback in aromatic amino acid synthesis
(Fell, 1997, p. 209). [Diagram not reproduced. Compounds shown: PEP + ery-4-P,
shikimate, chorismate, anthranilate, prephenate, tryptophan, phenylpyruvate,
hydroxyphenylpyruvate, phenylalanine, and tyrosine.]

There is no analogy in inanimate 
matter of codes being used to express 
an intended result to ensure continued 
system integrity. This will become clear 
after examining in the next paper how 
extraordinarily complex the molecular 
machines are which are needed to 
implement the code specifications. 


In Figure 15, we clarify the principle, 
which is not found anywhere in inani- 
mate nature. 

The intuition is that a system with 
complex internal components will be 
repetitively confronted with a decision 
that can be freely made, independent 
of chemical or physical compulsion. 
For each iteration a particular choice 
between alternative paths is correct to 
facilitate the survival of the system (plus 
the decision-making apparatus), based 
on current circumstances. 

The cell is full of this decision prin- 
ciple, such as where to initiate and stop 
transcription, which amino acid to add 
next to a growing protein, and where a 
restriction enzyme should cut.

Figure 15. In inanimate nature we find no examples of systems with complex
internal structure repetitively facing a contingent decision and then making the
correct choice for each iteration based on which outcome supports survival of
the system during that iteration.


We do not find any examples in in- 
animate nature, even though this is but 
a minimalist requirement. We are not 
demanding this occurs reliably a huge 
number of times (like millions of correct 
peptide bonds during a cell’s lifetime) 
or that it be able to manufacture all its 
key components, or that the entire ap- 
paratus be reliably replicated for many 
generations. We ask only for examples 
showing the basic concept is found in 
some elementary way in inanimate 
nature. Otherwise, no evolutionary 
theory is justified in simply assuming 
grotesquely more complex cells arose in 
the absence of intelligent guidance. This 
is essentially asking about the origin of 
information, like the sequence of codons 
specifying the correct protein chains. 

We know from computer technolo- 
gies how important proofreading and 
error correction (parity bit rules, etc.) 
are during data storage and transmis- 
sion. In cells, this is far more important, 
given the many examples of iteration 


and millions of decisions per second 
involved. We will consider just one code, 
the genetic code, to illustrate the need 
for extreme reliability. If the multiple 
copies of mRNA and their translation 
products were error prone, this would 
lead to error catastrophe during the cell’s 
lifetime. Each new batch of flawed pro- 
teins and RNA would lead to ever more 
defective transcription factors, RNA 
polymerases, ribosomes, spliceosomes, 
error-correcting enzyme complexes 
and posttranslation machines, thereby 
producing ever more defective proteins 
and RNA the next time around. The 
same, of course, is also true about all 
the components inherited by daughter 
cells, in particular flawed DNA copies. 
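
For readers unfamiliar with the parity bit rules mentioned above, the following sketch shows the simplest case, a single even-parity bit. It is a generic illustration of error detection in data transmission; the function names are chosen for this example, and a single parity bit can only detect an odd number of flipped bits, not correct them.

```python
def add_even_parity(bits):
    """Append one bit so that the total number of 1s is even."""
    parity = sum(bits) % 2
    return bits + [parity]


def parity_ok(word):
    """Return True when the word (data plus parity bit) has an even number of 1s."""
    return sum(word) % 2 == 0


word = add_even_parity([1, 0, 1, 1])
print(word, parity_ok(word))

# Simulate a single-bit transmission error by flipping one bit.
corrupted = word.copy()
corrupted[2] ^= 1
print(corrupted, parity_ok(corrupted))
```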

Is this a serious problem? The 
probability that an amino acid will be 
translated correctly depends on many 
factors, but suppose that in the distant 
evolutionary past, before elaborate 
error-correcting molecular machines 
existed, natural processes had somehow 


miraculously reached a state where each 
of the twenty amino acids was translated 
correctly with an average probability of 
0.80 and that proteins back then were 
on average only 200 residues long. The 
chance of obtaining a correctly translated
protein would be $(0.8)^{200} \approx 4 \times 10^{-20}$.
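
The arithmetic behind this estimate can be checked with a few lines of Python. The sketch assumes, as the estimate itself does, that every residue must be correct and that errors at different positions are independent; the function names are chosen for this example. The copy number of 20 million is taken from the vimentin figure quoted in the next paragraph.

```python
def probability_all_correct(per_residue_accuracy, length):
    """Probability that every residue is correct, assuming independent errors."""
    return per_residue_accuracy ** length


def expected_correct_copies(copies, per_residue_accuracy, length):
    """Expected number of fully correct proteins among `copies` attempts."""
    return copies * probability_all_correct(per_residue_accuracy, length)


p = probability_all_correct(0.80, 200)
print(f"probability of one correct 200-residue protein: {p:.2e}")
print(f"expected correct copies in 20 million attempts: "
      f"{expected_correct_copies(20_000_000, 0.80, 200):.2e}")
```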

One recent study of 40 proteins 
examined in HeLa cells concluded that 
the lowest number of copies per cell at 
a given time was for the oncogene FOS 
(6000 copies), and the most abundant 
was vimentin (20 million copies) (Zeiler 
et al., 2012). An ancient primitive or- 
ganism would not have so many copies. 
We would not expect to get even one 
correctly translated protein but a sea of 
hopelessly flawed, misfolded, and de- 
structively interacting ones (for a more 
exact analysis see Part 2). Even if the cell
could somehow recognize and degrade 
mistranslated ones (somehow using 
molecular machines that themselves are 
hopelessly corrupted), the energy cost 
to produce enough attempts to gener- 
ate thousands of necessary good copies 
would be prohibitive. 

What is the reality in all cells studied? 
Success rates on the order of “only” 0.8 
per monomer copied? Many processes 
recognize and correct errors, such as 
when DNA is replicated or tRNAs are 
charged. In exonuclease proofreading 
during DNA replication, a mismatched 
duplex is identified and the most recent- 
ly incorporated nucleotides removed 
and replaced, eliminating about 99.9% 
of accidental misincorporations from 
the nascent strand (Kunkel and Bebenek,
2000; Ibarra et al., 2009). A second 
mechanism, postreplication mismatch 
repair, then corrects about 99% of those 
misincorporations that escape exonucle- 
ase proofreading (Modrich and Lahue, 
1996; Kunkel and Erie, 2005). There 
is also a molecular machine to repair 
double-strand (DS) breaks (Brissett and 
Doherty, 2009). 
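
The combined effect of the two correction stages quoted above (about 99.9% and about 99%) follows from multiplying the fractions that escape each stage. The sketch below assumes the stages act independently and in sequence; the function name is chosen for this example.

```python
def escaping_fraction(correction_efficiencies):
    """Fraction of errors that survive every correction stage in sequence.

    Each efficiency is the fraction of remaining errors removed by a stage.
    Assumes the stages act independently.
    """
    fraction = 1.0
    for efficiency in correction_efficiencies:
        if not 0.0 <= efficiency <= 1.0:
            raise ValueError("each efficiency must lie between 0 and 1")
        fraction *= 1.0 - efficiency
    return fraction


# Exonuclease proofreading (~99.9%) followed by mismatch repair (~99%).
remaining = escaping_fraction([0.999, 0.99])
print(f"fraction of misincorporations remaining: {remaining:.1e}")
```

Under these figures, roughly one misincorporation in 100,000 would survive both stages.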

The other codes must also be highly 
accurate. TFs could bind to a multitude
of wrong locations on DNA; epigenetic
tags on histones or DNA must be carefully
controlled; flawed signal sequences
would cause proteins to be secreted 
improperly; etc. Furthermore, the coded 
variables and variable values must be 
replicated accurately over many genera- 
tions, not just the organism’s lifetime. 
A distinct combination of millions of 
methyl tags on DNA cytosines is unique 
to each cell type and needs to be replicated
in daughter cells (Carey, 2012, p.
60) as shown in Figure 16. 

We do not know how accurately 
the methylation pattern per CpG must 
be replicated for the daughter lineages 
to still work, but suppose it would be 
enough if “merely” 1/10,000 need to 
be correct (i.e., 99.99% error rate would 
not matter, the resulting pattern would 
still work). Per replication and one mil- 
lion CpGs, a successful outcome would 
only occur $4 \times 10^{-44}$ of the time (i.e.,
$0.9999^{1{,}000{,}000}$) even given such generous
constraints. In other words, getting an ac- 
ceptable copy will not occur even if only 
1/10,000 tag positions need be correct 
on average. We conclude that evolving 
this new function cannot start crude and 
be refined by random mutations, since 
natural selection would have nothing 
functional or consistent to work on. 
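
The order of magnitude quoted above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it assumes a per-site success probability of 0.9999 and independence between the one million CpG sites, which is how the figure in the text appears to have been obtained. The variable names are illustrative.

```python
import math

P_SITE_OK = 0.9999      # assumed per-CpG probability of a correct copy
N_SITES = 1_000_000     # number of CpG sites considered in the text

# Direct exponentiation; the result is tiny but still within float range.
p_all_ok = P_SITE_OK ** N_SITES

# Same quantity via logarithms, which stays stable for much larger N.
log10_p = N_SITES * math.log10(P_SITE_OK)

print(f"P(all sites correct) = {p_all_ok:.1e}")
print(f"log10 of that probability = {log10_p:.2f}")
```

The logarithmic form is included because direct exponentiation underflows to zero once the number of sites grows by another order of magnitude or so.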

Many researchers, especially those of 
a neo-Darwinian persuasion, continue 
to downplay the evidence for deliber- 
ate planning found in cells, preferring 
to hold on to the myth that most of the 
genome is junk instead of facing the 
reality of multiple codes and an over- 
arching systems design. The origin of 
complex features is assumed to result 
from random mutations followed by 
natural selection without recognizing 
or addressing the origin of formal logic 
processing (Dawkins, 1996). Absent in- 
formational guidance, the only alterna- 
tive is to believe in a series of naturalist 
miracles, such as an initial functional ge- 
netic apparatus followed by many more 
miracles including a regulated energy 
source (ATP molecules) and require- 
ments such as being able to distribute 


Figure 16. DNA methylation patterns need to be replicated in daughter cells 
during somatic cell division. After each DNA strand is separated and the second 
strand copied, the DNMT1 enzyme searches for CpG motifs and transfers a 
methyl (Met) group to the new strand where needed. This results in two new 


copies carrying the original methylation pattern. 


chromosomes and other components 
to daughter cells. 

Is this also a probabilistic nightmare? 
There are 2^92 ≈ 5 × 10^27 ways to distribute 
human chromosomes during mitotic 
cell division (Page and Hieter, 1999), of 
which only one is correct. There is a bet- 
ter chance to guess two people correctly 
in a row out of everyone who ever lived. 
And these odds need to be overcome by 
every surviving cell every generation, so 
once again error cascade is the natural 


consequence until the process is close 
to flawless. Natural selection is only 
relevant once the system has attained 
miracle-level perfection. 
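
The size comparison in this paragraph can be illustrated numerically. The sketch below assumes that the number of arrangements is 2^92, which is consistent with the stated value of about 5 × 10^27, and uses a hypothetical round figure of 10^11 for the number of people who have ever lived. That population figure is not taken from the source and is only a placeholder for the comparison.

```python
ARRANGEMENTS = 2 ** 92          # assumed reading of the count given in the text
PEOPLE_EVER_LIVED = 10 ** 11    # hypothetical round figure, not from the source

# Size of the space when one specific person must be guessed twice in a row.
two_guesses = PEOPLE_EVER_LIVED ** 2

print(f"2^92 = {ARRANGEMENTS:.2e}")
print(f"Two-person guess space = {two_guesses:.0e}")
print("Arrangements exceed guess space:", ARRANGEMENTS > two_guesses)
```

Python integers are arbitrary precision, so the comparison is exact for whatever population estimate is substituted.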

In general, whenever we come 
across the terms “convergent evolution,” 
“genetic piracy,” or “co-optation,” we 
will discover a failure of neo-Darwinian 
theory and in all likelihood further evi- 
dence that logic processing elements are 
being deliberately reused in unrelated 
organisms. For many years the very small 
amount of data available was misused 
(and continues to be) to claim that a 
gene expressed as part of the same or 
similar processes reveals common ancestry. 
In the words of Striedter (2003, p. 287), 
“Unfortunately, we now 
know that most genes are expressed in 
several different locations and that many 
homologies based on the expression pat- 
terns of single genes have turned out to 
be controversial, to say the least.” 

Until one accepts that cells are 
designed logic processors, much data 
will continue to be misunderstood. The 
same transcription factor or the same 
cis-factor pattern could be reused for 
biologically unrelated purposes across 
the biosphere. In programming, we also 
find software elements such as “for (int 
i = 0; i < myList.size(); i++)” in many 
programs, but this does not imply the 
programs are related in any manner. 
The i and myList could represent totally 
different things. 

In discussing the Pax-6 gene found 
in vertebrates, Drosophila, squid, and 
even flatworms, Willmer provides an 
example: “Although this could imply a 
common starting point for all eyes, it is 
more likely an example of the univer- 
sality of positional and pattern-forming 
determination systems in animals. Note 
also that while Pax-6 in vertebrates is ho- 
mologous to the Drosophila gene eyeless, 
other genes related to eye formation in 
vertebrates match bizarrely with genes 
involved in appendage formation and 
with muscle formation in fruit flies; and 
that Pax-6 also regulates the unrelated 
phenomenon of nasal placode formation 
in vertebrates” (Willmer, 2003, p. 38). 

Premature evolutionary speculation, 
treated and repeated for decades as prov- 
en scientific fact, is being increasingly 
corrected. Discussing the claim that 
the gene engrailed, which is expressed 
in both Drosophila and chordate meta- 
meres, proves that segmentation of body 
parts goes back 500 million years ago to 
a common ancestor, Willmer explained 
what more data now actually reveals: 


“This now seems an overinterpretation. 
Although homeobox proteins function 
as transcription factors for other genes, 
the genes they regulate are often quite 
unrelated to segmentation. Furthermore, 
this same Hox sequence appears in a far 
greater range of animals, including un- 
segmented nematodes and echinoderms” 
(Willmer, 2003, p. 39). After providing 
other examples, Willmer then arrives at 
the correct intuition: “The similarity of 
genes ... may lie in processes rather than 
in real homology” (p. 40). 


Scientific Guidance through 
the Design Presupposition 

The NIH Roadmap Epigenomics Con- 
sortium is collecting a huge database 
with DNA accessibility, RNA expres- 
sion, histone modification, and DNA 
methylation patterns for 111 human 
reference epigenomes (Kundaje et al., 
2015). One goal is to identify regulatory 
modules that arise during cell lineage 
specification and differentiation. This 
is representative of the general direction 
modern cellular research is beginning 
to take, where it has become indispens- 
able to apply principles from symbolic 
logic processing to understand in detail 
the design of cells. Speculative neo- 
Darwinism is at best post-facto storytell- 
ing; it provides no insights into the big, 
interesting biological questions. 

The view that cells were deliberately 
designed to be robust and adaptable 
for long-term viability and interactivity, 
along with the insights of logic processing 
principles from computer programming, 
stimulates many fruitful ideas 
for future research that do not arise 
from the evolutionary worldview. Freed 
from the shackles of possible biological 
functions being constrained to what a 
primitive common ancestor initially 
provided and a limitation on mutational 
accidents to generate nontrivial novelty, 
we suggest how our paradigm provides 
value to guide future research. 

1. Cells will be found to be more adapt- 
able than suspected to situations 


not encountered before, and when 
the mechanisms are researched, 
we will find the adaptive logic has 
coding aspects, meaning the vari- 
ables were already there and able 
to process additional values. Asking 
how one would formally design an 
optimized outcome, independent of 
any misguided prejudice from com- 
mon ancestry constraints, should 
help identify new cellular control 
processes. (Post-facto claims for 
unexpected “convergence” are scientifically 
worthless and contradict 
neo-Darwinian expectations.) 

2. Many more forms of complex 
regulation remain to be discovered 
than suspected. No iterative process 
will be found that lacks a formal 
set of rules on how to initiate and 
terminate (unless malfunctioning). 
Whenever it would make sense for 
the concentration and distribution 
of biomolecules to vary, we predict 
evidence will be found this has 
been implemented in a context- 
appropriate fashion. 

3. Given our conviction that cells were 
designed to function as holistic and 
integrated entities, we predict ever 
more discoveries of interconnectiv- 
ity between codes so that inputs 
throughout the entire cell and eco- 
system can be taken into account 
to regulate processes optimally. We 
expect much will become clear only 
as the optimization trade-offs are 
understood and that quantitative 
analysis will reveal there could not 
have been nearly enough evolution- 
ary trial-and-error attempts to explore 
and fine-tune these optimized trade- 
offs. 

4. More quality control checks will be 
discovered at key processing points. 
Researchers should search for error 
checks/correction during transcrip- 
tion to RNA and other key interfaces. 
Considering the value to cells of 
recycling valuable raw materials 
of every kind, we anticipate novel 
discoveries designed to ensure this. 
Conversely, if substances (like cyclic 
RNAs) are found to be long-lived, we 
suggest the Creator had a biological 
reason. 

5. We expect that when important 
alternative pathways are available, 
the overall optimal one under 
those circumstances is selected un- 
less clearly malfunctioning. As an 
example, whether to attempt error 
correction or initiate apoptosis is a 
significant decision for cells based 
on complex cost/benefit/risk trade- 
offs. We expect a careful quantita- 
tive application of decision theory 
principles—including Bayesian sta- 
tistics—to reveal that the outcome 
selected is overall rational. 

6. For every difficult step creating a 
critical potential processing bottle- 
neck, mechanisms will be found 
that resolve these, in the same way 
that we expect that an enzyme will 
be found to catalyze all key bio- 
chemical reactions impacting the 
survival of a cell. We also anticipate 
that variants of current enzymes and 
processes can easily be generated 
when it makes sense. This is based 
on our view that general-purpose so- 
lutions were often designed, which 
like good open systems design, are 
adaptable. Optimized adaptability 
has nothing to do with the naturalist 
assumptions going under the label 
evolution. 

7. Since we believe organisms were 
created optimally (with the goal of 
filling the earth’s ecosystems) but 
have accumulated errors over time, 
we will discover residual evidence for 
functioning solutions in the past, at 
the cellular or higher level, which do 
not work as well as before, especially 
for organisms that have undergone 
population-size bottlenecks. Apply- 
ing design reasoning to describe how 
ideal solutions would work will help 
us understand how things might 
have worked before. 


8. We will discover multidimensional 
forms of data storage and retrieval 
not known for computers. These 
will be sophisticated beyond any- 
thing a naturalist would dare pre- 
dict. We anticipate the existence of 
extraordinary code-based methods 
to store, retrieve, index, network, 
and consolidate in fuzzy logic and 
other mathematical forms all kinds 
of multimedia data (smell, vision, 
taste, sound, tactile memories, 
reasoning chains, numbers, facts, 
etc.) in ensembles of brain cells. 
We dare predict human minds will 
be found to be able to interact with 
these codes in read/write fashion to 
actively guide queries in a parallel 
processing fashion. 

Citrate Utilizing Mutants 
of Escherichia coli 


Kevin Anderson* 


Both the neo-Darwinian evolution 
model and the biblical creation model 
predict that organisms will adapt to 
their environment. However, they make 
distinctly different predictions regarding 
the manner of this adaptation. These 


Abstract 
(se of a “long-term evolution experiment,” populations of 

Escherichia coli have been grown for thousands of generations in 
a consistent environment. During this experimental period, various 
mutations have altered the bacteria’s phenotype. Some of these phe- 
notypic changes have included larger cell size and faster growth rates. 
The wild-type strain of E. coli can use citrate as an energy source in an- 
aerobic conditions, but not in aerobic conditions. However, after 31,500 
generations, a population of mutants developed that could aerobically 
utilize citrate. The formation of these Cit⁺ mutants entails an intriguing 
series of mutational steps involving both the citrate operon and 
other metabolic related genes. Thus, this new phenotype is frequently 
identified as (1) an example of the “birth of new genes,” and (2) how 
random mutation and natural selection can drive neo-Darwinian evolution. 
However, all the mutations detected in the Cit⁺ phenotype involve 
rearrangement of preexisting genes, loss of preexisting gene expression, 
or loss of preexisting regulation. Thus, the Cit⁺ mutants fail to provide 
a genetic example of the origin of new genes or regulatory systems. In 
contrast, these mutants fit precisely within predictions of a creation 
model; organisms have a programmed ability to adapt to specific envi- 


ronmental conditions. 


differing predictions can be useful in 
evaluating the accuracy and scientific 
validity of each model. 

The creation model proposes that 
life arose from the direct action of a 
creator. Organisms do not share a com- 
mon ancestry with all other organisms. 
For example, elephants were created as 
the elephant “kind,” and do not share 
a common ancestry with trees, fish, or 
primates. DNA mutations and other 
genetic events can alter an organism’s 
phenotype, but such changes are either 
from mutations (or other forms of DNA 
damage) or part of the programmed 
ability of the organism to adapt. 

Thus, the creation model predicts 
that organisms were created with the 
capacity to undergo adaptive modifications 
that can help them survive environmental 
changes of climate, nutrition, etc. 
Such modifications can involve physi- 
ological adjustments, such as increasing 
or decreasing cellular levels of specific 
proteins. These modifications may also 
involve genetic alterations, which can 
include mutations, horizontal gene 
transfer, and epigenetic events. 

There are also limits to the extent of 
these programmed adaptations. Bacterial 
antibiotic resistance is a classic 
example. Individual populations of 
bacteria develop resistance to specific 
antibiotics by mutation, gene transfer, 
and possibly epigenetic factors. In the 
presence of the antibiotic, this resistance 
gives a clear adaptive advantage over 
populations that lack resistance. Yet, 
gene transfer merely moves preexisting 
genetic systems into the cell, and muta- 
tions that provide resistance are gener- 
ally degenerative (Anderson, 2005). 

In contrast, the neo-Darwinian evo- 
lution model predicts all life on Earth 
shares a common ancestry (i.e., universal 
common descent). Contemporary 
life slowly originated and transformed 
from lower forms of life over immense 
periods of time. As such, vast physiologi- 
cal and genetic changes have occurred 
during biological history, leading to the 
development of new functional systems 
(e.g., wings, legs, eyes, and brains). 

Such changes are required by neo- 
Darwinism to account for universal 
common descent. Organisms must 
not only have a mechanism to adapt to 
differing environmental conditions (as 
also predicted by the creation model) 
but also have an additional mechanism 
to develop new genetic systems that 
they did not previously possess. This 
mechanism must account for the “birth” 
of new genes, new regulatory controls, 
new proteins, and new transport systems. 
Only with such a mechanism can inver- 
tebrates become vertebrates, flightless 
creatures acquire flight, and marine 
creatures acclimate to an air-breathing 
physiology. Thus, a key difference be- 
tween the two models is that creation 


predicts only limited changes, whereas 
neo-Darwinian evolution requires al- 
most unlimited levels of change. 

Proponents claim that the mecha- 
nism for this Darwinian transformation 
is random mutations, which alter an 
organism’s phenotype (Carlin, 2011; 
Merlin, 2010). If the alteration provides 
an adaptive advantage, natural selection 
will promote its spread within the popu- 
lation (Patterson, 1978; Merlin, 2010). 
Dawkins (1996, p. 79) refers to this as 
“the non-random survival of random 
variants.” 

Ruiz-Orera et al. (2015) note that 
“the birth of new genes is an important 
motor of evolutionary innovation.” Yet 
the search for examples of random muta- 
tions that give birth to these new genes 
has proven rather difficult (Anderson 
and Purdom, 2008; Depew and Weber, 
2011; Margulis and Sagan, 2002; Noble, 
2015). In fact, historically, Darwinists 
have generally focused on phenotypic 
changes with little attention to the 
resulting genotype. If the phenotype is 
positively selected, then it is assumed to 
be an example of the necessary “gene 
birth” (e.g., Dawkins, 1996; Coyne, 
2009; Zimmer, 2001). 

However, phenotypic advantage for 
an organism can frequently result from 
genetic degeneration. Some classic 
examples would be loss of transport pro- 
teins that provide antibiotic resistance 
to bacteria (Anderson, 2005), malaria 
resistance by improper folding of he- 
moglobin (Cholera et al., 2008), and 
HIV resistance due to loss of the CCR5 
protein (Allers et al., 2011). Each of 
these phenotypes has been offered as 
an example of “evolution in action,” 
yet all directly result from mutations 
that reduce or eliminate preexisting 
functional systems. 

This illustrates a key difference 
between genotype and phenotype of 
an organism. Mutations that provide a 
“beneficial” phenotype can result from 
a degenerative genotype. These types of 
mutations cannot account for the origin 


of the CCR5 protein, hemoglobin, 
transport systems, etc.—only their de- 
generation. This is fully consistent with 
a creation model; proteins and regula- 
tory systems were created fully formed 
and changes are limited. In contrast, 
these do not provide an example of the 
required neo-Darwinian mechanism for 
the origin of genetic systems, nor the 
virtually unlimited potential of biologi- 
cal change. 

In search of examples for this genetic 
mechanism, a popular study frequently 
cited by neo-Darwinists is the “long-term 
evolution experiment” (LTEE) using 
Escherichia coli. Results of this experi- 
ment provide a fascinating variety of dif- 
ferent mutants with various phenotypic 
adaptations, including the development 
of a new citrate utilizing phenotype. 
Results from the LTEE have also given 
rise to various claims about the adaptive 
capabilities of bacteria to generate new 
genes and functions. However, the cri- 
terion for these conclusions is primarily 
a positively selected phenotype, while 
genotypic changes are frequently a sec- 
ondary consideration. 


Long-Term 
Evolution Experiment 
In 1988 Richard Lenski began an in- 
teresting study of bacterial adaptation. 
Using a culture of E. coli B, substrain 
REL606 or REL607 (which are designated 
as the wild-type strains), the 
bacterium was cultivated aerobically in 
growth medium that contained a lim- 
ited amount of glucose (which served 
as the sole energy source) (Lenski, 
2010). The organism was incubated 
in 12 separate flasks (six containing 
REL606 and six containing REL607) 
at a consistent 37°C. Cultures in each 
flask were allowed to grow for 24 hours 
(approximately 6.6 generations), and 1% 
of each culture was transferred to 100 
ml of fresh media for another 24 hours. 
Every 500 generations, a sample from 
each flask was used for detailed analysis 
of both the phenotype and genotype of 
the bacterium. 
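
The figure of about 6.6 generations per day follows from the dilution step. The sketch below assumes that each culture regrows to its previous density within every 24-hour cycle, so that a 100-fold dilution is made up by log2(100) doublings. The function and variable names are illustrative and do not come from the cited protocol.

```python
import math

DILUTION_FACTOR = 100        # 1% of each culture is carried over per transfer
SAMPLING_INTERVAL = 500      # generations between analyzed samples, per the text

# Regrowth to the previous density after a D-fold dilution takes log2(D) doublings.
generations_per_day = math.log2(DILUTION_FACTOR)


def days_for(generations):
    """Daily transfers needed to accumulate the given number of generations."""
    return generations / generations_per_day


print(f"Generations per 24-hour cycle: {generations_per_day:.2f}")
print(f"Days per {SAMPLING_INTERVAL} generations: {days_for(SAMPLING_INTERVAL):.0f}")
print(f"Years for 31,500 generations: {days_for(31_500) / 365:.1f}")
```

The same helper gives a rough elapsed time for any generation count mentioned in the text, provided the transfers were made daily without interruption, which is an assumption here.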

Because of the limited glucose in the 
medium, the wild-type strain achieved 
only modest growth during each 24-hour 
period. Over time, subpopulations of the 
bacterium began to adapt to the medium. 
After 20,000 generations, some mutants 
had developed a larger cell size and grew 
approximately 70% faster than the wild- 
type strain (Lenski and Travisano, 1994; 
Novak et al., 2006). However, Lenski 
(2003) estimates that despite millions 
of mutations occurring in the bacteria 
during the first 20,000 generations, less 
than 100 point mutations and no more 
than 20 “beneficial” mutations became 
fixed within the population. 

While some mutants grow significantly 
faster than the wild-type at the 
experimental temperature (37°C), they 
generally have a reduced growth rate 
at lower (20°C) and higher (42°C) temperatures 
(Cooper et al., 2001a) and a 
reduced cold-stress tolerance (Sleight 
et al., 2006). These mutants also have a 
decreased survival in prolonged station- 
ary phase. Those mutants with enlarged 
cell sizes show an increased susceptibility 
to osmotic pressure (Philippe et al., 
2009). Some of these faster growing 
mutants lost MutT activity, which dra- 
matically increases their mutation rate, 
especially mutations causing a transversion 
(purine ↔ pyrimidine conversions) 
(Barrick et al., 2009). In addition, some 
of these mutants have a reduced level 
of ribose operon activity (Cooper et al., 
2001b), maltose regulon activity (Pelosi 
et al., 2006), or flagella activity (Cooper 
et al., 2003). 

This trade-off of features has been 
termed “antagonistic pleiotropy,” where 
an adaptive advantage is gained by 
losing a function or system that is not 
essential for the current environment 
(Cooper and Lenski, 2000; Ostrowski et 
al., 2005). Interestingly, in the absence 
of examples showing “the birth of new 
genes” by random mutations, evolution- 
ists have frequently pointed to antagonis- 


tic pleiotropy as a driving mechanism 
for evolutionary innovation (e.g., Rose, 
1985; Fry, 1993; Olson, 1999; Roff and 
Fairborn, 2006). This phenomenon is 
even identified as the primary contribu- 
tor to mutational adaptations during the 
LTEE (Cooper and Lenski, 2000). 

However, antagonistic pleiotropy is a 
degenerative event. Preexisting systems 
are lost, even if that loss provides a tem- 
poral or conditional advantage for the 
organism. While antagonistic pleiotropy 
fits within the predictions of both the 
neo-Darwinian and creation models, it 
fails to offer a genetic mechanism for 
the origin of the lost systems. Thus, it 
does not provide neo-Darwinian evolu- 
tion the needed genetic mechanism for 
universal common descent. 

What is more, identifying the forma- 
tion of a “new function,” “new system,” 
“new gene,” etc. is very context depen- 
dent. Is it a “new function” because the 
organism now metabolizes a unique 
substrate it could not previously utilize, 
or is it simply that a preexisting enzyme 
has a reduced specificity enabling it to 
bind with a larger pool of substrates? Is 
it a “new system” because the organism 
has a different growth characteristic, or 
did loss of one or more transport proteins 
alter its physiological balance? Did the 
organism form a “new gene” that it did 
not previously possess, or is it merely ex- 
pressing a “silent” gene because of loss of 
a regulatory protein? Do these represent 
evolution of new features or simply loss 
of preexisting systems? Making these 
distinctions illustrates the importance 
of identifying changes to the genotype 
and not just the phenotype. 


Citrate Mutants 


As a chelating agent, a low concentration 
of citrate was added to the base growth 
medium used during the LTEE (Lenski, 
2010). The wild-type strain (E. coli B) is 
not able to utilize citrate as an energy 
source (Cit⁻) in an aerobic environment 
(Scheutz and Strockbine, 2005). 
However, after 31,500 generations, 
Cit⁺ mutants were detected that could 
aerobically utilize citrate (Blount et al., 
2008). Since the primary energy source 
(glucose) was limited in the medium, 
mutants that could also utilize citrate 
for additional energy possess a distinct 
growth advantage versus the wild-type 
strain. 

These Cit⁺ mutants have become a 
popular example of “evolutionary inno- 
vation” by mutation and selection, and 
they certainly provide a unique insight 
into the adaptive capacity of bacteria. 
In fact, a standard classification characteristic 
of E. coli is that it cannot utilize 
citrate in aerobic conditions (Scheutz 
and Strockbine, 2005). Thus, by this 
definition, the Cit⁺ mutants are no 
longer E. coli. While Lenski and coworkers 
are not yet claiming a new species 
of Escherichia has “evolved” (Blount 
et al., 2008), they point to the Cit⁺ 
mutants as examples of how speciation 
can occur. Not surprisingly, this makes 
these mutants of special interest to both 
creationists and evolutionists. 


Citrate Operon 
E. coli can utilize citrate as an energy 
source, but only in anaerobic condi- 
tions. The citrate operon of E. coli B is 
comprised of several genes (Figure 1). 
A characteristic of bacterial operons is 
that transcription of multiple genes is 
initiated by the same promoter, result- 
ing in a polycistronic transcript (i.e., 
mRNA with multiple open reading 
frames). In the case of the cit operon 
(citCDEFXGT), the promoter is adja- 
cent to citC. When citrate is available 
to the cell, this promoter is activated 
by the CitA-CitB signal transduction 
system (Yamamoto et al., 2008). This 
transduction system is the product of citA 
and citB located upstream of the citrate 
operon (Figure 1). CitA is a membrane 
protein that apparently is inactivated by 
an increase of the redox potential within 
the cell (Yamamoto et al., 2009). This 
inactivation prevents CitA-CitB from 

Figure 1. Citrate operon and upstream genes in E. coli B. P, and P, denote promoter locations, and arrows indicate direction 
of transcription. The P, promoter is activated in anaerobic conditions, but remains inactive in aerobic conditions. 


(Adapted from the E. coli B str. REL606 genome sequence provided by www.patricbrc.org.) 


functioning under aerobic conditions, 
so the citrate operon is not expressed 
(Scheu et al., 2012). 

CitT functions as a transport pro- 
tein for citrate, specifically serving as a 
citrate/succinate antiporter (Pos et al., 
1998) (Figure 2). The lack of CitA-CitB 
activity under aerobic conditions prevents 
expression of CitT. Without this 
protein, E. coli cannot transport citrate 
into the cell, which results in the aerobic 
Cit⁻ phenotype. 

Interestingly, when citT was placed 
under direct control of a plasmid pro- 
moter (e.g., insertion into pUC19), 
expression of the gene occurs aerobically 
(Pos et al., 1998). E. coli transformed 
with this plasmid is Cit⁺ in aerobic conditions. 
Thus, the lack of CitT appears 
to be the primary limiting factor for 
aerobic utilization of citrate. Many of 
the other cit operon genes are involved 

Figure 2. Antiporter activity of CitT 
showing simultaneous import of citrate 
and export of succinate. 


in anaerobic utilization of citrate and 
thus are not necessary for the aerobic 
Cit⁺ phenotype. The study of Pos et al. 
(1998) demonstrates that E. coli already 
possesses metabolic pathways for aerobic 
catabolism of citrate; the organism just 
lacks an aerobic transport system. 
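
The regulatory logic described in this section can be summarized as a small truth-table model. The sketch below is a deliberate simplification written for illustration: it treats CitT transport as the only limiting factor, reduces CitA-CitB control to two boolean inputs, and uses hypothetical names (`Strain`, `citT_expressed`, `can_use_citrate`) that do not come from the cited studies.

```python
from dataclasses import dataclass


@dataclass
class Strain:
    """Toy description of a strain; field names are illustrative only."""
    name: str
    citT_promoter_is_constitutive: bool = False  # e.g., citT on a plasmid promoter


def citT_expressed(strain, aerobic, citrate_present):
    """Return True if the CitT transporter is expected to be produced."""
    if strain.citT_promoter_is_constitutive:
        return True
    # Native control: CitA-CitB activates the operon only when citrate is
    # present and the cell is not under aerobic conditions.
    return citrate_present and not aerobic


def can_use_citrate(strain, aerobic, citrate_present=True):
    """Citrate use is modeled as limited only by transport via CitT."""
    return citrate_present and citT_expressed(strain, aerobic, citrate_present)


wild_type = Strain("wild type")
plasmid_strain = Strain("citT on plasmid promoter", citT_promoter_is_constitutive=True)

for strain in (wild_type, plasmid_strain):
    for aerobic in (False, True):
        condition = "aerobic" if aerobic else "anaerobic"
        print(strain.name, condition, can_use_citrate(strain, aerobic))
```

In this toy model only the wild type under aerobic conditions fails to use citrate, which matches the description above. It ignores the other cit operon genes needed for anaerobic citrate metabolism.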


Mutant Genotype 

The initial report of the Cit⁺ mutants 
contains little genotypic information 
(Blount et al., 2008). Rather, the primary 
focus of the report was the phenotypic 
characteristics of the mutants: time of 
appearance in the population, growth 
characteristics, and frequency of reoc- 
currence. Using stored isolates from 
earlier generations, the researchers 
attempted to retrace the development 
of the Cit⁺ phenotype. They observed 
that Cit⁺ is more likely to reappear in a 
population after 20,000 generations than 
from populations of earlier generations. 
Blount et al. (2008) suggest this indicates 
a potentiating mutation occurred about 
10,000 generations before the first Cit⁺ 
phenotype was detected (making subsequent 
development of the Cit⁺ mutants 
more likely). 

Since the total number of predicted 
mutations during the first 30,000 genera- 
tions is far greater than the genome size 
of E. coli, this would suggest that the 
Cit⁺ phenotype did not result from just 
a single point mutation. After a more 
complete genetic analysis, Blount et al. 
(2012) conclude that one or more initial 
mutations (at or after 20,000 generations) 
are required before the Cit⁺ phenotype 
develops. From this they deduce that 
the formation of the phenotype involves 
three steps: potentiation, actualization, 
and refinement. The potentiation step 
establishes a physiological background 
that enables the aerobic Cit⁺ phenotype 
to ultimately develop. The actualization 
step marks the occurrence of a weak 
Cit⁺ phenotype, and the refinement step 
enables the mutant to more proficiently 
utilize citrate. 

A number of different mutations 
were found in the Cit⁺ population, as 
well as the population that gave rise to 
the Cit⁺ phenotype. Many of these mutations 
appear to be unrelated to citrate 
utilization. However, several have been 
identified as either directly or indirectly 
related to the development of the Cit⁺ 
mutants (Table 1). 


Potentiation 
As a possible potentiation step, Quandt 
et al. (2015) identified several mutations 
that appear in the E. coli population at 
25,000 generations. In particular, they 
found that a point mutation of gltA may 
help establish a favorable physiological 
background for the subsequent development 
of the Cit⁺ phenotype. Expression 
of gltA produces citrate synthase. This 
enzyme catalyzes the first step of the 
Krebs cycle: condensation of oxaloac- 
etate and acetyl-CoA. Citrate synthase is 
inhibited by NADH. As NADH levels in- 
crease, the synthase becomes less active. 
This inhibition helps the cell maintain 
its redox homeostasis. As the level of reduced 
electron carriers increases, Krebs 

Table 1. Genes identified in the potentiation, actualization, and refinement steps during the development of the E. coli 
Cit⁺ phenotype. 


cycle activity slows, decreasing oxidative 
release of electrons. 

E. coli can produce acetate as an 
electron sink during glucose metabo- 
lism (Wolfe, 2005). Once the medium’s 
limited glucose supply is depleted, the 
bacterium can utilize this acetate for additional 
energy. A point mutation (identified 
as gltA1) reduces the inhibitory 
effect of NADH, uncoupling the Krebs 
cycle from cellular levels of reduced 
electron carriers. As a result, the cell can 
achieve an increase of citrate synthase 
activity, which drives greater conversion 
of acetate to citrate. This would help 
maximize acetate catabolism after the 
glucose is consumed. 

In addition, mutations to iclR and 
arcB also may contribute to a potentiat- 


ing background (Quandt et al., 2015). 
Both IclR and ArcB are regulatory pro- 
teins that reduce expression of enzymes 
involved in the Krebs cycle and the 
glyoxylate shunt pathway. This shunt 
is an important pathway for acetate 
metabolism, as it enables acetyl-CoA 
to be converted to important anabolic 
intermediates (e.g., malate and oxalo- 
acetate) rather than be lost by decar- 
boxylation in the Krebs cycle (Cronan 
and Laporte, 2006) (Figure 3). The 
detected mutations of iclR and arcB 
reduce the inhibitory effect of IclR and 
ArcB, enabling increased production 
of Krebs cycle and glyoxylate pathway 
enzymes (Quandt et al., 2015). As with 
the gltA1 mutation, these mutations may 
potentially increase acetate utilization 


and also reduce acetate excretion. 
While evidence suggests gltA1, and 
perhaps to a lesser extent the iclR and 
arcB mutations, have a potentiating 
role in development of the Cit⁺ phenotype, 
their exact contribution to this 
phenotype is still not clear (Quandt et 
al., 2015). Acting together, these muta- 
tions increase acetate metabolism by 
increasing activity of the Krebs cycle 
and the glyoxylate pathway (Figure 3). 
Presumably, this helps increase the 
intracellular level of succinate or other 
C₄-dicarboxylates that can be a product 
of acetate metabolism. Since these 
molecules are exported in exchange for 
citrate (Figure 2), their increased cel- 
lular concentration may be a beneficial 
first step for increased citrate transport. 

Figure 3. Pathway of acetate metabolism. Several potentiating mutations apparently 
drive greater activity of the Krebs cycle and the glyoxylate shunt pathway 

Figure 4. Insertion of an IS3 element 
into different locations of citG. The 
IS3 promoter enables aerobic expres- 
sion of the downstream citT. 


Interestingly, when the Cit* phe- 
notype develops, these potentiating 
mutations likely become detrimental 
(Quandt et al., 2015). When citrate 
is the sole carbon and energy source, 
mutations increasing acetate utiliza- 
tion no longer provide a benefit. These 
mutations may cause compounds (such 
as acetyl-CoA and oxaloacetate) to be 
unnecessarily catabolized rather than 
serve as needed anabolic intermediates. 
These mutations may also make it more 
difficult for the Cit* mutants to maintain 
a redox homeostasis when catabolizing 
citrate. 


Actualization 

Within the variety of mutations found in 
the Cit* population, only two appear to 
be essential for the full Cit* phenotype 
(Quandt et al., 2014). One of these 
mutations is a “promoter capture” event. 
This capture involves placing citT under 
the control of an alternate promoter. 
Since such captures were detected in 
all Cit* mutants studied, this suggests 
it is a key mutation for aerobic citrate 
utilization. However, this mutation only 
results in a weak Cit* phenotype. Thus, 
it is possibly the actualization step pro- 
posed by Blount et al. (2012). 

One form of promoter capture results from the insertion of the IS3 element into different sites of citG. Since IS3 possesses a promoter that can activate adjacent genes (Charlier et al., 1982), this insertion initiates transcription of the downstream citT (Figure 4). The IS3 promoter is not under aerobic/anaerobic regulation; it thereby enables expression of citT in aerobic conditions.

A single Cit* mutant was found 
to have an inversion that moved the 
citG/citT segment downstream of fimB 
(Blount et al., 2012) (Figure 5). This 
enables the fimB promoter to initiate 
expression of citT. As with the IS3 pro- 
moter, the fimB promoter is not under 
aerobic/anaerobic control, so citT ex- 
pression can occur aerobically. 

All Cit* mutants that were analyzed also possess a rearrangement bringing the citG/citT/rna region downstream of the rnk gene (Blount et al., 2012) (Figure 6A). This rearrangement results in a portion of citG aligning with a segment of rnk. This forms an rnk/citGT fusion, enabling citT expression to be driven by the rnk promoter (Figure 6B). The rnk promoter enables a low expression of citT in aerobic conditions, resulting in a weak Cit* phenotype (Blount et al., 2012).

Figure 5. Chromosome rearrangement moves citG and citT downstream of fimB. This allows expression of citT via the fimB promoter.

In addition, several Cit* mutants possess tandem duplications of the rnk/citGT fusion (Figure 6C). An increased number of fusion copies appears to increase the production of the CitT transporter protein (Blount et al., 2012). This results in a stronger Cit* phenotype. Potentially, the increase in cellular levels of this transporter (1) increases internal levels of citrate, driving greater citrate metabolism, and (2) enables the cell to more effectively scavenge the low levels of citrate in the medium.
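
The promoter captures and tandem duplications described above can be summarized in a small toy model. This is only a sketch under stated assumptions: the `Promoter` class, the `aerobic_citt_copies` function, and the three-copy tandem array are hypothetical names and values chosen for illustration, and the rule used (a citT copy is transcribed in oxygen only if its promoter is aerobically active) is a simplification of the text, not a model taken from Blount et al. (2012).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Promoter:
    name: str
    active_aerobically: bool


def aerobic_citt_copies(promoter: Promoter, copies: int = 1) -> int:
    """Count citT copies transcribed when oxygen is present (toy rule)."""
    return copies if promoter.active_aerobically else 0


# Promoter behavior follows the text; the copy numbers are hypothetical.
native_cit = Promoter("native cit promoter", active_aerobically=False)
rnk = Promoter("rnk", active_aerobically=True)

ancestor = aerobic_citt_copies(native_cit)          # citT silent in oxygen
single_fusion = aerobic_citt_copies(rnk)            # one rnk/citGT fusion
tandem_fusion = aerobic_citt_copies(rnk, copies=3)  # assumed 3-copy tandem array

print(ancestor, single_fusion, tandem_fusion)
```

In this sketch, neither the promoter nor the gene changes between cases; only the pairing and the copy number differ, which mirrors the weak-versus-strong phenotype distinction drawn in the text.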


Refinement 

As mentioned above, the CitT transporter functions as an antiporter, exchanging citrate for C4-dicarboxylates (such as succinate or fumarate) (Pos et al., 1998). In this capacity, CitT transports citrate into the cell and simultaneously exports succinate (Figure 2). Inadequate levels of a C4-dicarboxylate will reduce the activity of the CitT antiporter, decreasing citrate transport.


dctA Mutation

Quandt et al. (2014) observe that the other mutation necessary for the full Cit* phenotype involves dctA. This gene expresses a protein that transports succinate and other C4-dicarboxylates into the cell (Golby et al., 1999; Steinmetz et al., 2014) (Figure 7). E. coli has several anaerobic transport systems for C4-dicarboxylates, but only DctA appears to serve as an aerobic transporter (Davies et al., 1999). DctA cotransports succinate and protons (H+), linking it with the proton motive force of the organism.


In the Cit* mutants, a mutation is located 20 bases upstream of the dctA open reading frame, suggesting it falls within the promoter/operator region of the gene. This mutation causes overexpression of dctA (Quandt et al., 2014), giving the cell a higher copy number of the DctA transporter. In turn, increased numbers of DctA can increase the level of imported succinate, providing more succinate for the CitT antiporter (Figure 8).

The dctA mutation alone does not yield a Cit* phenotype and may actually be deleterious to the bacterium (Quandt et al., 2014). However, without the overproduction of DctA, the mutants do not achieve the full Cit* phenotype. Apparently, without excess DctA, insufficient succinate is available to drive full activity of CitT. Thus, the dctA mutation is probably a “refinement.” Loss of the dctA mutation in Cit* mutants reduces their level of citrate utilization (Quandt et al., 2014).

Because E. coli can oxidize citrate to succinate (Figure 9), each mole of transported citrate could result in the production of one mole of succinate for cotransporting by CitT. By this scenario, the role of DctA (especially its overproduction) would appear unnecessary and certainly not a key component for the full Cit* phenotype. Rather, succinate production from the oxidation of citrate should be sufficient for full CitT activity.


However, it is unlikely that E. coli 
makes equal moles of succinate per mole 
of citrate. Since no amino acids and only 
one vitamin (thiamine) were added to 
the growth medium of the LTEE, E. coli 
must biosynthesize these compounds. 
Using various metabolic intermediates, 
E. coli can synthesize different amino 
acids and vitamins (e.g., α-ketoglutarate
serves as the anabolic intermediate pre- 
cursor for biosynthesis of glutamate and 
proline) (Lengeler et al., 1999). Since E. 
coli does not normally catabolize citrate 
under aerobic conditions, some details 
of this metabolism currently remain 
unknown. 
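
The mole balance argued in the preceding two paragraphs can be written as simple bookkeeping. The sketch below is illustrative only: the function name `succinate_shortfall`, the parameter `biosynthesis_fraction`, and the example value of 0.2 are assumptions introduced here, not measured quantities from the LTEE or from any cited study.

```python
def succinate_shortfall(citrate_in: float, biosynthesis_fraction: float) -> float:
    """Moles of succinate that must come from another source (e.g., import)
    to sustain a 1:1 citrate/succinate exchange.

    biosynthesis_fraction is the assumed share of imported citrate whose
    carbon is withdrawn for amino acid and vitamin synthesis instead of
    being oxidized all the way to succinate.
    """
    if not 0.0 <= biosynthesis_fraction <= 1.0:
        raise ValueError("biosynthesis_fraction must be between 0 and 1")
    succinate_made = citrate_in * (1.0 - biosynthesis_fraction)
    return citrate_in - succinate_made


# Hypothetical example: 10 mol citrate imported, 20% diverted to biosynthesis.
example = succinate_shortfall(citrate_in=10.0, biosynthesis_fraction=0.2)
print(f"succinate shortfall: {example:.2f} mol")
```

Under this assumption, any nonzero diversion to biosynthesis leaves less than one succinate per citrate, which is the gap the text proposes DctA overproduction helps to fill.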

The tandem duplicates of the rnk/citGT fusion enable overproduction of CitT. High copy numbers of CitT provide increased importing of citrate, even when the citrate concentration in the medium is low. However, increased CitT activity requires more C4-dicarboxylate for cotransporting. Hence, increased activity of dctA assures that sufficient levels of succinate are available to drive full activity of the multiple copies of CitT (Figure 8). The overproduction of DctA fits within this physiological setting, and explains the results observed by Quandt et al. (2014). Correspondingly, the overproduction of CitT balances the excess amounts of succinate being imported by the high levels of DctA.

Thus, the two mutations appear to have an epistatic relationship. Excess production of DctA has no benefit without being counterbalanced by the excess production of CitT. Also, without sufficient succinate for cotransporting, excess production of CitT only gives a weak Cit* phenotype. Hence, overproduction of both CitT and DctA is necessary for the full Cit* phenotype, while overproduction of one transporter without overproduction of the other has far less (if any) benefit for the bacterium.
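
The epistatic relationship described above can be illustrated with a deliberately simple toy model. This is only a sketch under stated assumptions: the function name `citrate_uptake`, the arbitrary capacity units, the basal succinate pool, and the rule that uptake is capped by whichever component is limiting are all hypothetical and are not taken from Quandt et al. (2014) or Blount et al. (2012).

```python
def citrate_uptake(citt_capacity: float, dcta_capacity: float,
                   basal_succinate: float = 1.0) -> float:
    """Toy antiporter model (hypothetical, arbitrary units).

    CitT imports one citrate per C4-dicarboxylate exported, so uptake is
    capped both by the amount of CitT and by the succinate supply
    (an assumed basal pool plus succinate imported by DctA).
    """
    succinate_supply = basal_succinate + dcta_capacity
    return min(citt_capacity, succinate_supply)


# Hypothetical expression levels, chosen only for illustration.
wild_type = citrate_uptake(citt_capacity=0.0, dcta_capacity=0.0)
citt_only = citrate_uptake(citt_capacity=5.0, dcta_capacity=0.0)
dcta_only = citrate_uptake(citt_capacity=0.0, dcta_capacity=5.0)
both = citrate_uptake(citt_capacity=5.0, dcta_capacity=5.0)

for label, rate in [("wild type", wild_type), ("CitT only", citt_only),
                    ("DctA only", dcta_only), ("CitT + DctA", both)]:
    print(f"{label:12s} uptake = {rate:.1f}")
```

With these assumed values, overproducing CitT alone gives only a small uptake, overproducing DctA alone gives none, and the combination gives the largest uptake, which is the qualitative pattern the text attributes to the two mutations.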


Secondary Mutations
While not necessary for the full Cit* phenotype (which only requires the citT and dctA mutations), some additional mutations potentially contributed to the development of the original Cit* mutants. These mutations negated the effect of the earlier potentiating mutations. As mentioned above, the potentiating mutations may have established a physiological background for the Cit* phenotype to develop (e.g., acetate metabolism driving the production of excess succinate for exporting), but once the citT and dctA mutations occur, the potentiating mutations are potentially detrimental.

Quandt et al. (2015) found that a group of mutations reduce or eliminate the effect of gltA1. Designated as gltA2, these secondary mutations of gltA reduce citrate synthase activity in the cell. This reduction of synthase activity potentially helps the mutants maintain a carbon and redox balance when catabolizing citrate. Interestingly, in the absence of citrate as a carbon and energy source, the gltA2 mutations were generally deleterious to the bacterium (Quandt et al., 2015).

By 34,000 generations, a nonsense mutation in aceA also appears in most of the Cit* mutants (Quandt et al., 2015). The aceA gene expresses isocitrate lyase, which diverts isocitrate from the Krebs cycle into the glyoxylate shunt. A nonsense mutation of aceA causes an inactive form of isocitrate lyase to be produced. The lack of this lyase activity enables more isocitrate to be oxidized to α-ketoglutarate (Figures 3 and 9). This mutation likely compensates for the effect of the potentiating iclR mutation, which increases shunting of isocitrate rather than its continued oxidation in the Krebs cycle.

Figure 6. Illustration of the rnk/citGT fusion. (A) A chromosome rearrangement moves a copy of the citG/citT/rna region downstream of rnk. (B) A fusion joins segments of rnk and citG, creating a hybrid rnk/citG region. The transcript of this region produces a short peptide with no identified function. The downstream citT is now under transcriptional control of the rnk promoter, which can function in aerobic conditions. (C) Duplication of the fusion results in multiple copies of the rnk/citG fusion and downstream citT and rna. The product of the rna gene is RNase I, which has not been shown to be directly relevant to the Cit* phenotype.


Darwinian Significance

Since the initial report of the Cit* mutants provided little genotypic information, this had the effect of creating a type of mystique regarding these organisms. Many assumptions and conclusions about the mutants’ evolutionary significance were offered with little actual understanding of their genetics or physiology. Add to this the apparent multistage “evolutionary innovation,” and the development of the Cit* phenotype has become a popular “show piece” for evolutionists (Hendrickson and Rainey, 2012; Pennisi, 2013).
Among the claims are that subpopulations of E. coli were able to “evolve” a novel capability they did not previously possess. It is even suggested that “a complex new function develop[ed] seemingly from scratch” (Pennisi, 2013, p. 793). Venema (2012a, 2012b) points to the results of the LTEE, and the Cit* phenotype in particular, as a “demonstration that new genes can indeed” evolve, and this is an example of a mutation that “creates a new regulatory element.” Indeed, formation of these Cit* E. coli mutants is frequently offered as the quintessential example of evolution in action: the evolution of a new function possibly giving rise to a new species. With more genetic details of the Cit* mutants now available, this claim can be more closely assessed.

Figure 7. Coordination of CitT and DctA activity. Import of succinate by DctA provides antiporter for CitT to cotransport citrate.

Figure 8. Epistatic relationship of CitT and DctA. High copy numbers of DctA increase transport activity of succinate into the cell, providing more succinate for antiporting by CitT. In turn, this allows high copy numbers of CitT to function, enabling higher levels of citrate to be transported into the cell. If copy numbers of DctA decline, then cellular levels of succinate decrease, which decreases CitT activity.

Blount et al. (2012) suggest the need
for a potentiating mutation (sometime 
after 20,000 generations) to provide the 
genetic background for the ultimate de- 
velopment of the Cit* phenotype. Some 
populations of the LTEE developed a 
hypermutation phenotype (Barrick et 
al., 2009; Wielgoss et al., 2013), which 
presumably increases the likelihood of 
beneficial mutations developing and 
boosts the organism’s fitness trajectory 
(Wiser et al., 2013). Interestingly, the 
Cit* phenotype did not arise from these 
hypermutable populations (Blount et 
al., 2008). 


New Gene or Regulatory Element? 
The rnk/citGT fusion captures the preexisting rnk promoter via chromosomal rearrangement. Gene duplication then results in multiple copies of this fusion. Rearrangements and duplications are common in enteric bacterial chromosomes (Roth et al., 1996; Matthews et al., 2011). The promoter for the cit operon is not part of the citT gene; hence, citT remains intact following this rearrangement. A new gene is not formed. Fusion of the 3’ end of rnk with the 5’ end of citG formed a hybrid rnk/citG region, which produces an 89 amino acid peptide with no reported activity (Blount et al., 2012). This does not constitute formation of a new gene, nor does the small peptide appear to have any relevance to the Cit* phenotype. Rather, functionality of both rnk and citG is lost in forming this fusion.

A new promoter is not formed either, since the rnk promoter remains intact following the rearrangement. This preexisting rnk promoter is merely placed upstream of citT, where it drives transcription of both the hybrid rnk/citG region and citT. The fusion of rnk/citG does not form a new promoter or regulatory region.

Figure 9. A possible E. coli metabolic pathway for the catabolism of citrate under aerobic conditions. Unlike anaerobic catabolism of citrate, it may not be necessary for E. coli to co-metabolize glucose. Since wild-type strains of E. coli do not catabolize citrate under aerobic conditions, many aspects of aerobic metabolism have yet to be investigated.

Quandt et al. (2015) detected several mutations that apparently contribute to a favorable physiological background for subsequent development of the Cit* phenotype. Mutations of gltA, iclR, and arcB all can increase acetate utilization (Leiby et al., 2014), which potentially provides higher levels of C4-dicarboxylates for cotransporting by CitT. While these mutations may constitute a necessary potentiating step, they are all loss-of-function mutations. As such, all can be categorized as antagonistic pleiotropy, but none serve as an example of “new gene” or “new regulatory element” formation.

The location of the dctA mutation
indicates that it lies within the gene’s 
regulatory region, altering promoter 
site activity. Increased levels of dctA 
transcription (as indicated by levels of 
mRNA) were found in the Cit* mutants 
compared with transcription levels in 
the wild-type (Quandt et al., 2014). 
This is consistent with loss of regulatory 
control of the promoter so that the gene 
is overexpressed. 


E. coli B (the wild-type strain in the LTEE) cannot utilize C4-dicarboxylates, such as succinate and fumarate, as energy sources (Quandt et al., 2014). This phenotype is probably attributable to a lack of dcuS expression in this strain of E. coli (Yoon et al., 2012). In a different strain (E. coli K-12), DcuS is part of a two-component regulatory system (DcuS-DcuR) that activates expression of C4-dicarboxylate utilization genes, including dctA (Golby et al., 1999) (Figure 10A). Therefore, lack of DcuS production in E. coli B prevents this strain from expressing dctA. Quandt et al. (2014) speculate that a mutation in the Cit* cells permits expression of dctA, even in the absence of DcuS production (Figure 10B). The specific nature of the mutation, though, is not yet known.
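
The regulatory logic in this paragraph (and in Figure 10) can be restated as a small truth table. The sketch below is a simplification under stated assumptions: the function name `dcta_expressed` and its two boolean inputs are hypothetical, and the rule ignores everything other than DcuS availability and the promoter-region mutation that Quandt et al. (2014) speculate about.

```python
def dcta_expressed(has_dcus: bool, promoter_mutation: bool) -> bool:
    """Toy rule: dctA is transcribed if the DcuS-DcuR system can activate
    its promoter, or if a promoter-region mutation bypasses that need."""
    return has_dcus or promoter_mutation


# Strain descriptions follow the text; the boolean encoding is an assumption.
strains = {
    "E. coli K-12": dcta_expressed(has_dcus=True, promoter_mutation=False),
    "E. coli B (LTEE wild type)": dcta_expressed(has_dcus=False, promoter_mutation=False),
    "Cit* mutant": dcta_expressed(has_dcus=False, promoter_mutation=True),
}

for strain, expressed in strains.items():
    print(f"{strain}: dctA {'expressed' if expressed else 'not expressed'}")
```

In this encoding, the mutation adds no new activator; it only removes the dependence on one that the LTEE strain lacks.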

Nonetheless, the dctA mutation apparently alters regulation of the gene. Such genotypes are common in bacteria, as promoters and other regulatory systems are easily altered by point or deletion mutations. While the exact nature of the dctA mutation has yet to be identified, there is little reason to assume this mutation is more than a deregulating event. Point or deletion mutations within the promoter’s core site or adjacent operator regions can result in altered promoter regulation (Gourse et al., 2000; Rhodius and Mutalik, 2010). For example, increased activity is observed for some bacterial promoters following a point mutation within the -10 region (Han et al., 1991) or a single nucleotide deletion within the -35 region (Burchhardt et al., 1997).

Cells need to maintain specific promoter activity to sustain physiologically appropriate levels of certain enzymes and other proteins. Mutations within the promoter/operator region can affect the binding affinity of a regulatory protein or the RNA polymerase. Either of these mutations can potentially cause overexpression of the gene, which can upset the critical physiological balance needed by the cell. As such, these mutations can be categorized as a loss of function, since the cell can lose critical control of gene activity (e.g., overexpression is common in many cancer cells). In fact, point mutations that increase promoter activity may contribute to a variety of genetic diseases in humans (Theuns et al., 2006). This is also illustrated in the observation that the dctA mutation may be deleterious to the bacterium without the compensating increase of CitT production (Quandt et al., 2014).

Volume 52, Spring 2016

Figure 10. Comparison of dctA activity in E. coli K-12 and E. coli B. (A) DcuS-DcuR activates the dctA promoter, enabling expression of the gene in E. coli K-12. E. coli B (the LTEE wild-type strain) lacks DcuS, causing the dctA promoter to remain inactive. (B) In Cit* mutants, a mutation (indicated by *) in the regulatory region of the dctA gene enables expression of the gene even in the absence of DcuS.

The subsequent secondary mutations in Cit* strains (which negate the potentiating mutations) are also loss of function. Most of the gltA2 mutations alter the amino acid sequence of citrate synthase, rendering the enzyme less active (Quandt et al., 2015). One of the gltA2 mutations (designated gltA2-R) is a point mutation near the gene’s promoter region, causing a reduction in gltA transcription (as measured by mRNA production) (Quandt et al., 2015). In addition, the aceA mutation eliminates isocitrate lyase activity (Quandt et al., 2015). This compensates for the effect of the iclR and arcB mutations.

All these secondary mutations serve to compensate by reducing the activity levels of certain enzymes. Each of these secondary mutations is an example of antagonistic pleiotropy: beneficial to the organism as it shifts metabolism to citrate but at the expense of preexisting enzyme activity. No new genes or promoters were formed, only loss of preexisting enzyme activity and promoter function.

Therefore, the mutations detected in 
the Cit* organisms fail to provide a ge- 
netic mechanism for the origin of genes, 
promoter or operator sites, or the origin 
of any regulatory elements. Instead, 
they serve as an example of how loss of 
preexisting activity or the overexpression 
of specific genes can provide an adaptive 
advantage in specialized conditions. 


New Function? 

From their analysis, Blount et al. (2012, 
p. 517) conclude that the Cit* phenotype 
results from “the multi-step origin of a 
key innovation” producing “new func- 
tions.” Yet, E. coli already possesses the 
ability to utilize citrate for energy. The 
genes for all the needed enzyme and 
transport systems are already present. 

Potentially, the most straightforward way to achieve aerobic expression of the cit operon would be a mutation that directly alters its regulation. The absence of this genotype in any Cit* mutants suggests such alterations may have overriding negative consequences. For example, the aerobic/anaerobic regulation of the cit operon is apparently a function of the redox sensitivity of CitA. In high redox conditions (i.e., aerobic), CitA is inactive. Perhaps eliminating this redox sensitivity reduces the overall activity of CitA, so that the cit operon would still remain inactive in aerobic conditions.

On the other hand, any mutation that places citT under control of an aerobically functioning promoter can offer at least a low-level Cit* phenotype. What likely makes the rnk/citGT fusion the most effective of the various promoter captures is that this fusion can occur in tandem repeats. Apparently the increased number of citT copies that result from this duplication is necessary for a strong Cit* phenotype. The rnk/citGT fusion moves citT downstream of the rnk promoter, but the promoter and the gene remain unchanged.

The dctA mutation apparently alters
the promoter activity of the gene.
This alteration removes some type of 
regulatory control of the gene, causing 
overexpression. As discussed above, this 
is a loss-of-function mutation. 

The benefit of this dctA mutation is related to its epistatic relationship with the rnk/citGT fusion. Epistatic interactions of genes and mutations are not unique, per se, as bacteria have many, some with a positive effect (e.g., Trindade et al., 2009) and some with a negative effect (e.g., Khan et al., 2011). Interestingly, negative epistasis appears to be more common in eukaryotic organisms than in bacteria (Xu et al., 2012). Hence, bacteria may be more capable of benefiting from an epistatic interaction of mutations than other organisms.

While a number of different mutations were found in many of the Cit* mutants, only the dctA mutation and rnk/citGT fusion were necessary for the full Cit* phenotype (Quandt et al., 2014). Yet, neither of these mutations can appropriately be classified as “birth” of a “complex new function ... from scratch.” A preexisting gene “captures” a preexisting promoter, and regulatory control of another preexisting promoter is lost.

Ultimately, the development of the 
Cit* phenotype results from the reorgani- 
zation of preexisting genes, loss of preex- 
isting enzyme activity, and elimination 
of preexisting regulatory control. From 
a research perspective, the Cit* mutants 
offer a very interesting study in bacterial 
adaptation and genetic versatility (and 
certainly a “yeoman task” to elucidate). 
However, if the wild-type E. coli strain 
did not already possess these genetic sys- 
tems (such as citT and dctA), there have 
been no mutations observed during the 
entire LTEE that would have generated 
the Cit* phenotype “from scratch.” 

Since both glucose and citrate are present in the medium at very low levels, it is not surprising that any mutant that can effectively use both of these as an energy and carbon source would have a competitive advantage against the wild-type strain. Yet, the Cit* phenotype is possibly less advantageous to the bacterium than initially concluded. In the original study, the Cit⁻ population eventually became extinct in the presence of the Cit* strains. This was assumed to be a consequence of competition from the more-fit Cit* population but may have actually been a result of random experimental factors. In subsequent studies, populations of Cit⁻ cells (adapted to the LTEE medium) coexisted with Cit* mutants, and no extinction occurred even after 2,500 generations (Turner et al., 2015). While the Cit* mutants would appear to have a distinct growth advantage over the Cit⁻ population, the advantage is apparently less than would be predicted. Perhaps the heavy mutational load of the Cit* strains exerts a toll sufficient to reduce the overall physiological benefit of being able to utilize both glucose and citrate. Perhaps other factors will eventually be elucidated.


Bacteria as a Model 

In the game of adaptation, bacteria have 
a decided advantage over most other 
organisms. For example, bacteria can 
sustain and survive (at least temporarily) 
a higher mutation rate than vertebrates 
(Denamur and Matic, 2006; Linz et al., 
2014). In fact, the hypermutator pheno- 
type that developed during the LTEE 
would potentially be lethal to vertebrate 
populations (Sniegowski et al., 2000). 
For bacteria, however, this high muta- 
tion rate may increase the probability 
that an adaptive mutation will occur 
(Wielgoss et al., 2013). 

Bacteria also have a much faster 
generation rate than vertebrates (optimal 
generation time frequently measured in 
minutes rather than years or decades). 
In fact, they outnumber all sexually 
reproducing populations by multiple 
orders of magnitude (Crawford, 2007), 
enabling them to roll the “mutation 
dice” quadrillions of times each day. In 
addition, bacteria have several genetic 
“tricks” they can employ for adaptation, 
such as a high level of genome plasticity
and a vast pool of horizontally transfer- 
rable “plug-and-play” genes (Blount, 
2015; Lenski, 2004). Combined with 
a potential hypermutation phenotype 
and haploid genetics (i.e., one chromo- 
some, no heterozygous genotypes), these 
characteristics enable bacteria to often 
quickly adapt to different, even hostile, 
environments. As such, they are excel- 
lent engines of adaptation, explaining 
why they can be found in virtually ev- 
ery part of earth’s biosphere. All these 
characteristics also enable bacteria to 
pay a much higher “cost of substitution” 
than most organisms (for a more detailed 
discussion of this principle see ReMine, 
2005, 2006). 

This raises questions of how applicable bacterial adaptation studies are to “higher” organisms, such as plants or animals. Because bacteria are haploid asexual organisms, their reproductive biology and inheritance differ significantly from sexual reproduction and inheritance. Thus, the ability of bacteria to genetically accomplish an adaptation does not mean other organisms can repeat this achievement.

For example, attempting to apply the 
LTEE design to diploid sexual organ- 
isms, Burke et al. (2010) conducted a 
long-term experiment with Drosophila. 
They applied a fairly strong “selection” 
protocol for accelerated development 
and early fertility of the flies. After over 
600 generations of “selection,” the 
researchers found several phenotypes 
present in the experimental population, 
including a 20% faster rate of develop- 
ment from egg to adult. Yet, they found 
no unconditional beneficial mutation 
had become fixed within the population. 
Since the experimental conditions are 
much more intense than what Drosoph- 
ila would experience in a natural setting, 
the researchers note that it is even less 
likely any beneficial mutation would 
become fixed within a wild population 
in the same time frame. A subsequent
experiment with Drosophila also failed to
detect significant fixation of beneficial
mutations following numerous generations
of “selection” (Orozco-terWengel
et al., 2012). This illustrates how difficult
it is for Drosophila to generate and fix 
just a single beneficial mutation, even 
under significant “selective” pressure. 


Conclusions 
From a microbiology and genetics perspective, the Cit* mutants are a fascinating collection of organisms. They offer many informative insights into the preprogrammed ability of bacteria to adapt to environmental limitations (such as the limited food source in the LTEE base medium). The multiple steps involved in the adaptation make the mutants attractive as a mechanism for Darwinian evolution. However, as intriguing as this multistep sequence is as a study model, careful analysis of the genotype of these mutants reveals they are not an appropriate example of a “de novo origination of genes” or an organism building a new function “from scratch.”

What new gene was formed? Genes involved in the potentiating step (gltA, iclR, and arcB) and the secondary mutations of the refining step (gltA and aceA) are already present in the wild-type bacterium. Mutations of these genes reduce gene expression, enzyme activity, and regulatory controls. All these mutations would be categorized as degenerative, the opposite of “de novo origination of genes.” What is more, all genes involved in providing the full Cit* phenotype (rnk, citG, citT, and dctA) were also already present in the wild-type organism. Moving citT downstream of a different promoter does not constitute formation of a new gene. In addition, alteration of the regulatory site upstream of dctA does not constitute a new gene, only the elimination of the transcriptional control. Both citT and dctA remain structurally intact and their transcripts unchanged. Also, the rnk/citG fusion actually eliminates rnk and citG as functioning genes.

Where is the new regulatory element? The gltA2-R mutation reduces activity of the gltA promoter. This mutation lowers transcription levels of the gene. The reduction of transcription activity of a preexisting promoter does not provide a genetic mechanism for the origin of new promoters. Also, the mutations of iclR and arcB eliminate the function of regulatory elements rather than form new elements. Note as well, the different promoters (those from IS3, fimB, and rnk) captured for aerobic expression of citT were already present in the wild-type strain. Each of the promoters remains intact and unchanged. The “capture” merely places citT adjacent to these preexisting promoters. In the same context, a mutation within the dctA promoter/operator region did not form a new promoter; rather the mutation eliminates preexisting regulatory control of that promoter.

What is the new function? Utilization of citrate is not a new function for E. coli; only the environmental condition of that utilization has changed (i.e., aerobic vs. anaerobic). This change occurred by (1) eliminating the regulation of a preexisting system (i.e., deregulating dctA) and (2) removing citT from a preexisting regulatory control and placing it under a different preexisting control. Therefore, claims of a new function are, at best, context dependent.

Whether argued as a new function or 
not, fusion of a preexisting gene to a pre- 
existing promoter fails to offer a genetic 
mechanism for the origin of either the 
gene or the promoter. Nor does elimina- 
tion of a preexisting regulatory system 
serve as an example of how regulatory 
systems originated. Furthermore, dupli- 
cation of citT merely copies a preexisting 
gene and hence provides no insight into 
the origin of the duplicated gene. Even 
if Darwinists claim gene duplication al- 
lows for subsequent “evolution” of one of 
the duplicated genes, this did not occur 
in the Cit* mutants. The duplicated citT 
remains the same gene. 

The genotypes of the Cit* strains are
interesting examples of how bacteria
can rearrange preexisting genes, reduce 
preexisting enzyme activity, and alter 
regulation of preexisting promoters. 
However, mutations within these strains 
did not form new genes or promoters 
“from scratch.” Rather, they serve as an- 
other example of antagonistic pleiotropy, 
or loss of preexisting systems providing a 
temporal benefit. As such, the mutants 
fail to provide a genetic example of how 
genes, promoters, and regulatory systems 
originated. Thus, the Cit+ mutants are
not an example of the required genetic 
mechanism for neo-Darwinian trans- 
formation enabling universal common 
descent. 

Despite being an excellent engine 
for adaptation, bacteria fail to provide 
any examples of a genetic mechanism 
for new genes and regulatory controls. 
If bacteria are unable to offer such a 
mechanism, what basis is there to presume
that it can be attained by lesser
engines, such as mice, trees, fruit flies,
and primates? The literature is filled 
with examples of phenotypic changes 
but barren of genetic examples enabling 
universal common descent. The failure 
of the Cit+ mutants to provide such an
example further reinforces the general 
failure of Darwinian evolution as an 
explanatory tool for the origin of genetic 
systems. 

In contrast, the Cit+ mutants fit
appropriately within the predictions of 
a creation model. Organisms, such as 
bacteria, are preprogrammed to adapt 
to differing environments. This adapta- 
tion can involve eliminating preexist- 
ing regulatory controls, chromosome 
rearrangement, and even using external 
elements (such as IS3) for altered gene 
activity. Yet, the alterations were limited 
in scope and did not result in the de novo
origination of new genes or regulatory 
elements.

Addendum


A recent study demonstrates that E. coli
can acquire the needed citT and dctA
mutations in less than 100 generations
(Van Hofwegen et al., 2016). Employing
a different selection protocol than
the LTEE, these researchers repeatedly
obtained Cit+ mutants. They conclude
that the rarity of the Cit+ mutants during
the LTEE was strictly an artifact of the
experimental conditions. Interestingly,
Van Hofwegen et al. (2016) also arrive
at the conclusion that no new genes
were formed in the generation of the
Cit+ mutants. In fact, E. coli adaptation
generally involves altering the regulatory 
control of pre-existing genes (Zinser and 
Kolter, 2004), and formation of new 
genes remains largely undocumented.


The Making of the Fittest: Evolving Switches, Evolving Bodies (Video)

Howard Hughes Medical Institute Biointeractive,
http://www.hhmi.org/biointeractive/making-fittest-evolving-switches-evolving-bodies


It has often been said that
natural selection may explain the survival
of the fittest, but it does not explain
the arrival of the fittest. This video by
the Howard Hughes Medical Institute
(HHMI) attempts to address the latter.
HHMI hosts a number of videos that
promote both science and evolutionary
philosophy, including this video that
takes a look at the genetics underlying
the adaptive loss of the pelvic spines in
freshwater threespine sticklebacks.

The video is well done. It weaves
together beautiful natural scenery, interviews
conducted by Sean Carroll with two
researchers, and quality graphics
in an engaging way.
Carroll is a molecular biologist who has
conducted research on gene regulation
in fruit fly development. In 2010 he was
named vice president for science education
of the Howard Hughes Medical
Institute, and he has played an important
role in the production of the HHMI
Biointeractive videos.

The first scientist interviewed is
Mike Bell, who has done important work
evaluating patterns of morphological
variation in threespine sticklebacks. It
is believed that these freshwater sticklebacks
are descended from anadromous
sticklebacks, which migrate from the
ocean to freshwater streams to spawn.
Somehow, they were cut off from the
ocean and have adapted to full-time
life in their current freshwater habitat.
In doing so they lost their pelvic spines,
which, though helpful in the ocean,
can be a liability in the lake as hungry
dragonfly larvae grab them when attempting
to make a meal of the fish. Except
for the timescale, this proposed history 
is compatible with a biblical worldview. 

Next, David Kingsley is interviewed, 
and the fascinating story is told of how he 
identified the underlying genetic differ- 
ence that accounts for the loss of pelvic 
spines. Interestingly, the threespine 
stickleback is designed with multiple 
regulatory elements that control expression
of the identified gene (pitx1) in
different tissues. One regulatory region
was deleted in the freshwater fish, and
further experiments confirmed that it
controlled expression of the gene in the
pelvic region. The fact that other regulatory
regions control pitx1 expression in
other critical tissues allows for the loss of 
this trait without killing the fish. What 
incredible design that allows for adap- 
tation! Yet those involved in the video 
seem to completely miss this evidence 
of design because of their evolutionary 
worldview. 

In the case of sticklebacks, freshwa- 
ter species from a number of locations 
around the world have the same regu- 
latory region deleted. Thus, it appears 
that the same basic adaptive strategy 
has been used multiple times. While
at one time evolutionists believed this
type of repeated evolution unthinkable 
because of their underlying assump- 
tions of random mutation and natural 
selection, they have now been forced to 
recognize that this does occur (Brodie, 
2010). Rather than seriously scrutinize 
their assumptions, biologists have gen- 
erally marveled at how this shows the 
power of natural selection. In doing so, 
they ignore the realistic mathematical 
modeling that demonstrates that natural 
selection does not have these magical 
powers; it is not efficient at fixing most 
beneficial mutations or removing most 
deleterious ones (Lightner, 2015). While 
being enamored by the supposed powers 
of natural selection, they are distracted 
from productive research questions that 
could enable them to find the underly- 
ing basis of this remarkably repeatable 
genetic transformation made by these 
fish when trapped in a freshwater en- 
vironment. 

The video story line turns back to 
Mike Bell and the work he has done 
looking for stickleback fossils in Nevada. 
There are tens of thousands of layers of 
rock in a dried-up lake bed that contains 
many stickleback fossils. The deeper lay- 
ers have an abundance of sticklebacks 
with spines, but there is a sudden shift to 
fish without spines after a few thousand 
layers. Since the layers are interpreted as 
being annual (varves), it is claimed this 
represents thousands of years. Yet work
by Mike Bell and others suggests these
transformations frequently occur within
a few decades (Bell et al., 2004; Lescak
et al., 201).

Notice that the observed transforma- 
tion documented in the literature is hap- 
pening several orders of magnitude faster 
than what is claimed by the evolutionists 
based on the fossil record. Something is 
wrong, namely the assumption that the 
layers are annual (varves). There are other 
dried up lake beds (e.g., Green River 
Formation in Wyoming) where it is very 
clear the layers cannot be annual (Oard 
and Whitmore, 2006). Also, the excellent 
preservation makes it clear that the fish 
were deposited rapidly, before significant 
decay or scavenging could occur (Whit- 
more, 2006). So empirical evidence gives 
us strong reason to doubt the age assigned 
by the evolutionists; in reality it occurred 
well within a biblical timeframe. 

At the beginning of the video Sean 
Carroll mentions that the changes in 
sticklebacks provide evidence of how all 
creatures evolve. If this is so, it is powerful
evidence against molecules-to-man
evolution. For these sticklebacks, traits
(e.g., body armor and pelvic spines) are 
repeatedly lost when the fish become 
trapped in a freshwater environment 
and, in the case of pelvic spines, the 
underlying regulatory genetic sequence 
is deleted. These are changes headed 
in the wrong direction to support evo- 
lutionary beliefs on origins. Changes 
in regulatory sequences require that 
such sequences exist and that they are 
designed in such a way that adaptive 
change is possible. Thus, logically, life 
must have begun with an astonishingly 
complex design that allows for adaptive 
changes. This is consistent with bibli- 
cal creation, not evolution. Further, 
the changes occur much more rapidly 
and predictably than evolutionists had 
previously thought possible. This sug- 
gests that designed mechanisms, not just 
random mutation and natural selection, 
are likely involved.


This video is one of
the many Biointeractive videos
that promote both science and an 
evolutionary worldview. Sean Carroll 
accompanies Michael Nachman as they 
discuss the adaptive coloration patterns 
in pocket mice in New Mexico. Carroll
strongly promotes evolution, and in
particular natural selection, as the major 
mechanism by which adaptive evolu- 
tionary changes take place. He feels 
these mice are an excellent example of 
adaptation by natural selection. 
Nachman has studied mice on the 
black rocks of a lava flow known to be
about 1000 years old, as well as mice
found in the surrounding desert. The 
mice on the lava flow are black and 
blend in well with the color of the rocks. 
The mice living on the lighter rocks of 
the desert have a lighter color. In each 
case, the matching color makes them 
less of a target for predators.

Genetic evaluation has shown that
black mice have a mutation in the Mc1r
gene. Four nucleotide bases differ be- 
tween them and their light-colored rela- 
tives. This is an example of rapid adapta- 
tion. Interestingly, black mice from older 
lava flows elsewhere in the Southwest do 
not have this mutation. Thus, while we 
see rapid adaptation, much like in the 
sticklebacks in the Evolving Switches, 
Evolving Bodies video, the gene targeted 
is not always the same. 

Again, the evolutionists in this 
video do not seem to recognize that 
this adaptation required a preexisting 
complex system that was designed in a 
way to allow for adaptive color changes. 
Most mice can produce both a lighter 
pigment (yellow to red) and a darker 
pigment (dark brown to black). In fact, 
if you look at a hair from the back of a 
light-colored mouse, you will notice a 
wide light band between two dark ones. 
The light band is missing from the dark 
mice, since they produce only the dark 
pigment (Nachman, 2005). 

The mutation basically keeps the 
signal for the darker pigment turned 
on, making the Mc1r unresponsive to
the upstream signal that would cause 
the banding pattern. So while it is adap- 
tive, it takes advantage of preexisting 
complexity. This is consistent with the 
biblical worldview of an all-wise Creator 
who made and sustains life (even in this 
fallen world). However, it is a problem 
for the evolutionists because adaptive 
changes do not build these impressively 
complex systems, leaving the evolution- 
ists with no plausible way to account 
for them. 

As I have pointed out elsewhere, 
natural selection cannot be a major 
player in accounting for the majority of 
patterns of adaptation we see in animals 
around us (Lightner, 2015). It neither accounts
for the origin of adaptive diversity,
nor is a particularly effective mechanism 
at fixing beneficial alleles under most 
circumstances. However, that does not
mean that it never plays a role. In the
case of the pocket mice, it probably does 
play at least some role in maintaining the 
distinct color differences between mice 
in these neighboring regions. 

Several unsubstantiated claims are 
made in the video. One is that muta- 
tions are random. This is a foundational 
assumption of neo-Darwinism, but it has 
long been known that mutations tend 
to occur at hotspots. They are not really 
random as to when or where they occur 
(Noble, 2013). 

A second unsubstantiated claim is 
that the mice have no preference for a 
light or dark background; it is the preda- 
tors that make all the difference. Mice 
do occasionally cross the boundaries, 
and there is gene flow between the light 
and dark populations (Hoekstra et al., 
2005). However, this is insufficient to 
support this strong assertion. There does 
not seem to be serious consideration of 
factors other than natural selection that 
may have contributed to the pattern seen 
(e.g., dark mice choosing to migrate into 
the area with dark rocks resulting in a 
founder effect; the possibility of non- 
Mendelian inheritance such as biased 
gene conversion, etc.). 

This brings up a final point. Carroll 
claims that accounting for this pattern is 
simple when it comes to the math. He 
says that if one mouse in 100,000 is born
with this black coloration and there are 
hundreds of thousands born each year, 
and if the black color gives that mouse a 
10% advantage, then it only takes about 
100 years for this trait to take over the 
population. He is ignoring genetic drift 
and one other big problem. How does 
this black mouse get born on the right 
background? One would have needed 
a sizable population of light mice on 
the dark background to have the rare 
mutant born there, yet this seems un- 
likely. If the dark mouse is born on a 
light background, it is at a disadvantage. 
If it does not have a preference for either
background, how do we get this mouse
to enter the environment for which it 
is best suited? Something significant 
appears to be missing from Carroll’s 
explanation. 
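
The arithmetic Carroll describes can be roughed out with a short calculation. The sketch below is not the model used in the video; it assumes the simplest deterministic haploid selection recursion, p' = p(1 + s) / (1 + ps), starts from the quoted frequency of 1 in 100,000, and applies the quoted 10% advantage. The function name and the 50% and 99% thresholds are illustrative choices. Genetic drift, recurrent mutation, dominance, and the question of which background the mouse is born on are all left out, which are precisely the factors raised above.

```python
def generations_to_reach(p0, s, target):
    """Count generations until the variant's frequency p reaches `target`.

    Assumes the deterministic haploid selection recursion
    p' = p * (1 + s) / (1 + p * s). Drift and new mutation are ignored.
    """
    p = p0
    generations = 0
    while p < target:
        p = p * (1 + s) / (1 + p * s)
        generations += 1
    return generations


# Figures quoted in the review: 1 dark mouse per 100,000 and a 10% advantage.
start_frequency = 1 / 100_000
advantage = 0.10

# Thresholds are illustrative assumptions, not values from the video.
to_majority = generations_to_reach(start_frequency, advantage, 0.5)
to_near_fixation = generations_to_reach(start_frequency, advantage, 0.99)
print(to_majority, to_near_fixation)
```

Under these assumptions the dark variant spreads in roughly one to two hundred generations. How that converts to years depends on the generation time of the mice, which the review does not state, and a stochastic model that includes drift could give a very different picture for a single rare mutant.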

I very much enjoyed watching the 
video, especially since coat color genet- 
ics is one area of interest for me. In fact, 
it was my in-depth study of the Mc1r in
other animals that made it increasingly 
obvious to me that random mutation 
plus natural selection cannot account for 
many of the patterns we see (Lightner, 
2008). The video is nicely done, but 
I do not feel it is suitable for showing 
to students without addressing world- 
view issues. As is common elsewhere, 
reasonable scientific conclusions and 
naturalistic evolutionary philosophy are 
boldly proclaimed with the same level 
of certainty. Students need help being 
able to sort through these issues, but the 
videos are only helpful for that if there 
is thought-provoking critical analysis 
that follows.


Author Zingaro was reared
a Catholic and attended Catholic
schools from kindergarten through
eighth grade (p. 8). He claims that not 
once during his Catholic education did 
he read the Bible in school, nor was it 
read in church. He relates that the first 
time he read a Bible was at age 28 (p. 
12). Zingaro began losing his Catholic 
faith when he was in college studying 
journalism, and he soon joined a Prot- 
estant church. He was also at this time 
in life a missionary in Africa and active 
in Bible teaching. 

He left Christianity due to the indoc- 
trination in evolution received when a 
student at the Pittsburgh Presbyterian 
Theological Seminary. It seemed the 
most important thing learned at the 
Presbyterian school was to not trust the 
biblical record. He writes that one day 
when reading a commentary of “one of 
my highly respected professors ... Old 
Testament scholar, Dr. Donald Gowan 
... wrote matter-of-factly in a commentary
on Genesis that Adam and Eve had
not existed as real individuals. Rather, 
they were merely symbolic characters 
representing all of humankind” (p. 29). 

In seminary, Zingaro ended up 
believing the Bible was not a product
of God’s revelation but rather was written
by people who were “swayed by the
prejudices of their day” (p. 128). As a 
result of this and other instruction in 
the seminary, his faith was shaken to the 
point that only a small part of Orthodox 
Christianity remained. By the time his 
seminary experience ended, Zingaro 
writes that he was on his way to becoming
a full-fledged Darwinist (p. 128),
which is the topic of most of his 406- 
page book (p. 83). He concludes that 
the evidence scientists “saw with their 
own eyes did not match the stories in the 
Scriptures” (p. 86). His descent into full- 
fledged Darwinism included accepting 
the conclusion that “natural Selection is 
ultimately the creator of all life.” 

Nonetheless, Zingaro was ordained 
as a Presbyterian minister in 1994 and 
has been the pastor of First Presbyterian 
Church of Newton, New Jersey since 
2007. Previous to this he served for 13 
years at the Bryn Mawr Presbyterian 
Church in Cottage Grove, Wisconsin, 
a suburb of Madison. 

Much of this book (pp. 161-379) 
is about the Dover Intelligent Design 
trial (which Zingaro calls
a federal court case) concerning those
who take the Bible literally and those 
who do not (p. 5). He quotes extensively 
from the trial documents in an effort to 
show that Darwinism has been proven to 
be true, and no credible opposition to it
exists. He spends much time bemoaning
the fact that many people reject what 
he calls the proven fact of Darwinism 
(p. 151). 

The judge in the Dover case ruled in 
favor of those who reject the biblical re- 
cord, thus violating the first amendment 
of the Constitution, which requires the 
state to be religiously neutral. Zingaro 
concludes that the “fundamentalists 
[creationists] are deluded” (p. 5) and he 
now totally accepts the current orthodox 
evolutionist claims—with little or no 
awareness or understanding of the other 
side and the problem with many of the 
evolutionary claims. In short, although 
a long book, Zingaro gives very few 
valid scientific arguments to support 
his conclusions about Darwinism. He 
gives very little space weighing the pro 
and con arguments for the views he 
discusses. Although he had many close 
female friends, Zingaro later revealed 
that he accepted the gay lifestyle and 
contracted colon cancer, a disease that 
is common among sexually active male 
homosexuals (p. 135). This book is yet 
another well-documented example of a 
lack of training in apologetics, leading to 
indoctrination into secular humanism 
and Darwinism.


For 40 years, Peter
and Rosemary Grant conducted
intensive field research on finches
in the Galapagos Islands. While many 
biologists study populations and make 
inferences about what happened in 
the past, the Grants actually individu- 
ally identified, measured, and followed 
finches on Daphne Major Island. This 
type of prospective study gave them a 
significant advantage in assessing how 
natural selection and other factors affect 
a population. The bulk of their work 
was with the medium ground finch, 
although they also collected data on 
several other species. The results of their 
work changed much of what was previ- 
ously believed about natural selection 
and speciation. 

The medium ground finch was of 
primary interest because it was variable 
in a number of traits, especially beak 
size. Natural selection can work only if 
there is variability, so they were a logical 
choice for research on this topic. The 
Grants found that natural selection 
wasn’t constant but operated strongly 
during specific periods of time when 
the environment changed, namely, dur- 
ing droughts. During the most severe 
droughts, many birds died, and the size
of the beak was the trait that made
the difference between life and 
death. Thus, it can be said that 
natural selection targeted beak 
size. As a result, the average beak 
size changed in the population. 
The change in the mean of a trait is 
what the Grants refer to as evolution. 

Interestingly, natural selection did 
not always operate in the same direc- 
tion. In some years the small seeds were 
depleted and the death toll was greatest 
among birds with smaller beaks. How- 
ever, one drought was preceded by some 
very wet years that caused the island to 
be overrun by plants bearing small seeds. 
In that drought it was birds with larger 
beaks that were disadvantaged because 
their food source was depleted first. 
The reality that natural selection oscil- 
lates in direction has some important 
implications. 

The Grants talk about natural selec- 
tion as a driving force for divergence. 
And indeed, they explain in detail in 
the book how natural selection changed 
the average beak size in the medium 
ground finches. However, this does not 
mean that natural selection was helping 
the birds adapt. In fact, it was reducing 
useful variety. The food sources returned 
after the return of the rains, yet many of 
the birds with a beak size ideal for ex- 
ploiting the food source had died when 
natural selection was operating. 

Such oscillating weather patterns 
that underlie natural selection can 
actually hinder adaptation by putting a
population at serious risk. However, the
Grants found another factor that affected
the average beak size of the birds: hybridization.
In many years a small number
of individuals from the population
bred with members of another species. 
As their offspring backcrossed with one 
of the parental species, new alleles were 
brought into the population, and varia- 
tion was increased. 

Hybridization can have a number of 
different long-term effects, depending 
on the specifics involved. In some cases 
two different species can coalesce, the 
reverse of speciation. At other times the 
hybrids may breed among themselves 
rather than backcross with one of the 
parent species. In doing so they may be- 
come a new species. The Grants found 
both these patterns among the finches 
they studied. In the case of the cactus 
finch, hybridization had a much stronger 
influence on body size and beak length 
than selection, which, when detected, 
was in the opposite direction than the 
population evolved during their study. 

The Grants expand on the significance
of hybridization and its relationship
to speciation over several chapters.
It is not just their work but also a number
of other studies that have shown that 
hybridization is important in the natural 
history of many species. In fact, it is now 
believed that hybridization is likely a 
critical catalyst for adaptive radiations, 
where organisms rapidly diversify and fill 
a variety of environmental niches, such 
as the finches have done in the Galapagos
(Abbott et al., 2013). Hybridization
appears to allow for the introduction of 
essential genetic diversity that allows for 
such adaptive responses to the environ- 
ment. 

The Grants also made some impor- 
tant discoveries about factors associated 
with a species colonizing a new area. 
They had noticed large ground finches 
sometimes visited Daphne Island, 
but they never stayed to breed until 
1982-83, a time when conditions were 
very favorable. The initial population of 
two females and three males produced 
young, but only a brother/sister pair 
survived to breed. Subsequently, there 
were signs of inbreeding depression until 
other immigrants of the species joined 
them. The large ground finches that 
contributed to this colonization came 
from at least three different islands. 

One discovery they found particu- 
larly surprising was that the birds who 
stayed to form the breeding colony 
were genetically different from those 
that chose to leave before the breeding 
season. This is in opposition to what
biologists often assume. Thus, habitat
choice (by the birds) and environmental
uncertainties (which may result in natural
selection) both played a role in the
initial establishment of the population.
In chapter 6 the Grants cover various
factors, including the founder effect,
immigration, and genetic drift, which
contribute to the genetic makeup of the
new population.

Since the major dry season food
source for the large ground finch is
also an important food source for the
large-beaked medium ground finches,
it was expected that competition might
end up affecting the birds. Indeed this
occurred 22 years after the large ground
finch population was established. Their
numbers had greatly increased when
another drought hit in 2003-2004.
There was a very high death toll in both
large and medium ground finches. No
selection was evident in the large ground
finch population, but the larger beaked
medium ground finches were more
seriously disadvantaged than those with
smaller beaks in their population. Since
beak size is highly heritable, this was
reflected in an unprecedented decrease
in average beak size. In later years, when
the larger seeds again were plentiful, the
medium ground finch population did
not recover the lost variability to enable
them to exploit the resource.

The book is an excellent overview
of the Grants’ 40 years of research. It
is well written and well laid out, but
there are enough technical details that
are relevant that it is not an easy read.
There is a helpful summary at the end
of each chapter, appendices with extra
information, a list of references, and a
subject index. Although the authors are
evolutionists who hold to the secular
old-earth timescale, this is still a valuable
book that can provide creationists
with important insights into some of the
factors affecting the natural history of
created kinds.


Reference
Abbott, R., et al. 2013. Hybridization and
speciation. Journal of Evolutionary Biology
26(2): 229-246.


Jean Lightner
jklightner@gmail.com


Letters to the Editor


The policy of the editorial staff of CRSQ is to allow letters to
the editor to express a variety of views. As such, the content
of all letters is solely the opinion of the author, and does not
necessarily reflect the opinion of the CRSQ editorial staff or
the Creation Research Society.


This letter ought to have appeared with Andrew Snelling’s letter in the Fall 2015 issue. I apologize for this oversight.
—Danny R. Faulkner, Editor


Reply to CRSQ Letter to the Editor, “iDINO Corrections”


In CRSQ 52(2):151, Dr. Snelling identified
a place name error in our article
titled “Radiocarbon in Dinosaur and
Other Fossils.” In CRSQ 51(4):306 we
wrote, “Igneous petrologist Andrew Snelling
carbon dated fossil wood extracted
from the middle Triassic Hawkesbury
Sandstone of Queensland.” This should
instead have read, “Igneous petrologist
Andrew Snelling carbon dated fossil
wood extracted from the Newcastle
Coal Measures in the Sydney Basin.”
Incidentally, the result under discussion
appears in Figure 6 under “coalified
bark.” However, this correction does not
alter our conclusions.


Brian Thomas and Vance Nelson


Submission


Electronic submissions of all manuscripts and graphics are pre- 
ferred and should be sent to the editor of the Creation Research 
Society Quarterly in Word, WordPerfect, or Star-Office/Open 
Office (see the inside front cover for address). Printed copies 
also are accepted. If submitting a printed copy, an original plus 
two copies of each manuscript should be sent to the editor. The
manuscript and copies will not be returned to authors unless 
a stamped, self-addressed envelope accompanies submission. 
If submitting a manuscript electronically, a printed copy is 
not necessary unless specifically requested by the Quarterly 
editor. Manuscripts containing more than 35 pages (double- 
spaced and including references, tables, and figure legends) 
are discouraged. An author who determines that the topic 
cannot be adequately covered within this number of pages is 
encouraged to submit separate papers that can be serialized. 

All submitted manuscripts will be reviewed by two or 
more technical referees. However, each section editor of the 
Quarterly has final authority regarding the acceptance of a 
manuscript for publication. While some manuscripts may be 
accepted with little or no modification, typically editors will 
seek specific revisions of the manuscript before acceptance. 
Authors will then be asked to submit revisions based upon 
comments made by the referees. In these instances, authors 
are encouraged to submit a detailed letter explaining changes 
made in the revision, and, if necessary, give reasons for not 
incorporating specific changes suggested by the editor or 
reviewer. If an author believes the rejection of a manuscript 
was not justified, an appeal may be made to the Quarterly 
editor (details of appeal process at the Society’s web site, www. 
creationresearch.org). 

Authors who are unsure of proper English usage should 
have their manuscripts checked by someone proficient in the 
English language. Also, authors should endeavor to make 
certain the manuscript (particularly the references) conforms 
to the style and format of the Quarterly. Manuscripts may be 
rejected on the basis of poor English or lack of conformity to 
the proper format. 

The Quarterly is a journal of original writings, and only 
under unusual circumstances will previously published mate- 
rial be reprinted. Questions regarding this should be submitted 
to the Editor (CRSQeditor@creationresearch.org) prior to 
submitting any previously published material. In addition, 
manuscripts submitted to the Quarterly should not be concur- 
rently submitted to another journal. Violation of this will result 
in immediate rejection of the submitted manuscript. Also, if 
an author uses copyrighted photographs or other material, a 
release from the copyright holder should be submitted. 


Appearance 

Manuscripts shall be computer-printed or neatly typed. Lines 
should be double-spaced, including figure legends, table 
footnotes, and references. All pages should be sequentially 
numbered. Upon acceptance of the manuscript for publica- 
tion, an electronic version is requested (Word, WordPerfect, 
or Star-Office/Open Office), with the graphics in separate 
electronic files. However, if submission of an electronic final 
version is not possible for the author, then a cleanly printed 
or typed copy is acceptable. 

Submitted manuscripts should have the following organi- 
zational format: 
1. Title page. This page should contain the title of the manu- 
script, the author’s name, and all relevant contact information 
(including mailing address, telephone number, fax number, 
and e-mail address). If the manuscript is submitted by multiple
authors, one author should serve as the corresponding author, 
and this should be noted on the title page. 
2. Abstract page. This is page 1 of the manuscript, and should
contain the article title at the top, followed by the abstract for 
the article. Abstracts should be between 100 and 250 words 
in length and present an overview of the material discussed in 
the article, including all major conclusions. Use of abbrevia- 
tions and references in the abstract should be avoided. This 
page should also contain at least five key words appropriate 
for identifying this article via a computer search. 
3. Introduction. The introduction should provide sufficient 
background information to allow the reader to understand the 
relevance and significance of the article for creation science. 
4. Body of the text. Two types of headings are typically used 
by the CRSQ. A major heading consists of a large font bold 
print that is centered in column, and is used for each major 
change of focus or topic. A minor heading consists of a regular 
font bold print that is flush to the left margin, and is used fol- 
lowing a major heading and helps to organize points within 
each major topic. Do not split words with hyphens, or use all 
capital letters for any words. Also, do not use bold type, except 
for headings (italics can be occasionally used to draw distinc- 
tion to specific words). Italics should not be used for foreign 
words in common usage, e.g., “et al.”, “ibid.”, “ca.” and “ad 
infinitum.” Previously published literature should be cited us- 
ing the author’s last name(s) and the year of publication (ex. 
Smith, 2003; Smith and Jones, 2003). If the citation has more 
than two authors, only the first author’s name should appear 
(ex. Smith et al., 2003). Contributing authors should examine 
this issue of the CRSQ or consult the Society’s web site for 
specific examples as well as a more detailed explanation of 
manuscript preparation. Frequently-used terms can be abbrevi-
ated by placing abbreviations in parentheses following the first
usage of the term in the text, for example, polyacrylamide gel 
electrophoresis (PAGE) or catastrophic plate tectonics (CPT). 
Only the abbreviation need be used afterward. If numerous 
abbreviations are used, authors should consider providing a 
list of abbreviations. Also, because of the variable usage of 
the terms “microevolution” and “macroevolution,” authors 
should clearly define how they are specifically using these 
terms. Use of the term “creationism” should be avoided. All 
figures and tables should be cited in the body of the text, and 
be numbered in the sequential order that they appear in the 
text (figures and tables are numbered separately with Arabic 
and Roman numerals, respectively). 
5. Summary. A summary paragraph(s) is often useful for 
readers. The summary should provide the reader an overview 
of the material just presented, and often helps the reader to 
summarize the salient points and conclusions the author has 
made throughout the text. 
6. References. Authors should take extra measures to be certain 
that all references cited within the text are documented in 
the reference section. These references should be formatted 
in the current CRSQ style. (When the Quarterly appears in 
the references multiple times, then an abbreviation to CRSQ 
is acceptable.) The examples below cover the most common
types of references: 

Robinson, D.A., and D.P. Cavanaugh. 1998. A quantitative approach 
to baraminology with examples from the catarrhine primates. 
CRSQ 34:196-208.

Lipman, E.A., B. Schuler, O. Bakajin, and W.A. Eaton. 2003. 
Single-molecule measurement of protein folding kinetics. Sci- 
ence 301:1233-1235. 

Margulis, L. 1971a. The origin of plant and animal cells. American
Scientist 59:230-235.

Margulis, L. 1971b. Origin of Eukaryotic Cells. Yale University Press, 
New Haven, CT. 

Hitchcock, A.S. 1971. Manual of Grasses of the United States. Dover 
Publications, New York, NY. 

Walker, T.B. 1994. A biblical geologic model. In Walsh, R.E. (editor), 
Proceedings of the Third International Conference on Creationism 
(technical symposium sessions), pp. 581-592. Creation Science 
Fellowship, Pittsburgh, PA. 

7. Tables. All tables cited in the text should be individually 
placed in numerical order following the reference section, and 
not embedded in the text. Each table should have a header 
statement that serves as a title for that table (see a current issue 
of the Quarterly for specific examples). Use tabs, rather than 
multiple spaces, in aligning columns within a table. Tables
should be composed with 14-point type to ensure proper ap-
pearance in the columns of the CRSQ.

8. Figures. All figures cited in the text should be individually
placed in numerical order, and placed after the tables. Do
not embed figures in the text. Each figure should contain
a legend that provides sufficient description to enable the 
reader to understand the basic concepts of the figure without 
needing to refer to the text. Legends should be on a separate 
page from the figure. All figures and drawings should be of 
high quality (hand-drawn illustrations and lettering should be 
professionally done). Images are to be a minimum resolution of 
300 dpi at 100% size. Patterns, not shading, should be used to 
distinguish areas within graphs or other figures. Unacceptable 
illustrations will result in rejection of the manuscript. Authors 
are also strongly encouraged to submit an electronic version 
(.cdr, .cpt, .gif, .jpg, and .tif formats) of all figures in individual
files that are separate from the electronic file containing the 
text and tables. 


Special Sections 
Letters to the Editor: 


Submission of letters regarding topics relevant to the Society 
or creation science is encouraged. Letters commenting
upon articles published in the Quarterly will be
published two issues after the article’s original publication
date. Authors will be given an opportunity for a concurrent 
response. No further letters referring to a specific Quarterly 
article will be published. Following this period, individuals 
who desire to write additional responses/comments (particu- 
larly critical comments) regarding a specific Quarterly article 
are encouraged to submit their own articles to the Quarterly 
for review and publication. 


Editor's Forum: 

Occasionally, the editor will invite individuals to submit differ- 
ing opinions on specific topics relevant to the Quarterly. Each 
author will have opportunity to present a position paper (2000 
words), and one response (1000 words) to the differing position 
paper. In all matters, the editor will have final and complete 
editorial control. Topics for these forums will be solely at the 
editor’s discretion, but suggestions of topics are welcome. 


Book Reviews: 
All book reviews should be submitted to the book review edi- 
tor, who will determine the acceptability of each submitted 
review. Book reviews should be limited to 1000 words. Follow- 
ing the style of reviews printed in this issue, all book reviews 
should contain the following information: book title, author, 
publisher, publication date, number of pages, and retail cost. 
Reviews should endeavor to present the salient points of the 
book that are relevant to the issues of creation/evolution. Typi- 
cally, such points are accompanied by the reviewer’s analysis of 
the book’s content, clarity, and relevance to the creation issue. 

History—The Creation Research Society was organized
in 1963, with Dr. Walter E. Lammerts as first president
and editor of a quarterly publication. Initially started as 
an informal committee of 10 scientists, it has grown rap- 
idly, evidently filling a need for an association devoted 
to research and publication in the field of scientific 
creation, with a current membership of over 600 voting 
members (graduate degrees in science) and about 1000 
non-voting members. The Creation Research Society 
Quarterly is a peer-reviewed technical journal. It has 
been gradually enlarged and modified, and is currently 
recognized as one of the outstanding publications in the 
field. In 1996 the CRSQ was joined by the newsletter 
Creation Matters as a source of information of interest 
to creationists.
Activities—The Society is a research and publication 
society, and also engages in various meetings and 
promotional activities. There is no affiliation with any 
other scientific or religious organizations. Its members 
conduct research on problems related to its purposes, 
and a research fund and research center are maintained
to assist in such projects. Contributions to the research
fund for these purposes are tax deductible. As part of its
vigorous research and field study programs, the Society 
operates The Van Andel Creation Research Center in 
Chino Valley, Arizona. 

Membership—Voting membership is limited to scien-
tists who have at least an earned graduate degree in a
natural or applied science and subscribe to the State- 
ment of Belief. Sustaining membership is available 
for those who do not meet the academic criterion for 
voting membership, but do subscribe to the Statement 
of Belief. 
Statement of Belief—Members of the Creation 
Research Society, which include research scientists 
representing various fields of scientific inquiry, are com-
mitted to full belief in the biblical record of creation and
early history, and thus to a concept of dynamic special 
creation (as opposed to evolution) both of the universe 
and the earth with its complexity of living forms. We 
propose to re-evaluate science from this viewpoint, and 
since 1964 have published a quarterly of research articles
in this field. All members of the Society subscribe to the 
following statement of belief: 





1. The Bible is the written Word of God, and because it 
is inspired throughout, all its assertions are historically 
and scientifically true in all the original autographs. To 
the student of nature this means that the account of
origins in Genesis is a factual presentation of simple
historical truths.

2. All basic types of living things, including humans, 
were made by direct creative acts of God during the 
Creation Week described in Genesis. Whatever bio-
logical changes have occurred since Creation Week
have accomplished only changes within the original 
created kinds. 

3. The Great Flood described in Genesis, commonly 
referred to as the Noachian Flood, was a historical event 
worldwide in its extent and effect. 

4. We are an organization of Christian men and women 
of science who accept Jesus Christ as our Lord and Sav- 
ior. The act of the special creation of Adam and Eve as 
one man and woman and their subsequent fall into sin 
is the basis for our belief in the necessity of a Savior for 
all people. Therefore, salvation can come only through 
accepting Jesus Christ as our Savior. 

How can pliable, stretchable tissue survive inside
dinosaur fossils for over 65 million years?

How can this tissue still contain intact cells and
even dinosaur proteins?

How can this fragile biological material survive
for so long?

The answer to these questions directly challenges the current,
evolutionary-biased, geologic timescale.

A fragment of the Triceratops brow horn. Fragments, such as this one,
still contain tissue and cells.











The Creation Research Society began its iDINO research initia- 
tive for the purpose of studying soft tissue in dinosaur fossils. 
The first phase of the project detected pliable, unfossilized tissue 
in a brow horn of a Triceratops. Within this tissue were intact os- 
teocytes (bone cells). Some results from the iDINO project have 
been published in a technical microscopy journal and presented 
at an international microscopy conference. The Spring 2015 issue 
of the Creation Research Society Quarterly also features a special 
report of the iDINO project. Plus, to further spread the important
information about soft tissue, the Society is developing a video
(Echoes of the Jurassic).

Microscopic examination of tissue extracted from a Triceratops horn
reveals bone cells still present.

The second phase of the project (iDINO II) will look more
extensively at the process of tissue preservation. Evolutionists
have offered various theories of how this tissue could survive for
millions of years. iDINO II will methodically investigate these
preservation claims, assessing their plausibility.


The iDINO results have already provided a strong challenge to 
the evolutionary worldview. More extensive and detailed ex- 
amination may provide even stronger evidence that the age of 
dinosaur fossils is far less than 65 million years. To this end, the 
Society continues to seek those willing to fund this project with
either one-time gifts or monthly donations.


For more information contact us at (928) 636-1153 or crsvarc@crsvarc.com.
Also visit http://tinyurl.com/nphm2c4 for project updates and details.

Electron microscope picture of intact bone cells still in tissue
extracted from a Triceratops horn.